{
 "cells": [
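  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Advanced pandas Practice | 早起Python\n",
    "<br>\n",
    "\n",
    "**These exercises are original work by the WeChat official accounts 早起Python and 可视化图鉴. For reposting or any other form of cooperation, please contact us (WeChat ID `sshs321`). Unauthorized copying and derivative works are strictly prohibited, and infringement will be pursued.**\n",
    "\n",
    "\n",
    "\n",
    "These exercises are based on `pandas` version `1.1.3`. For best results, run everything in `Jupyter Notebook`.\n",
    "\n",
    "Syntax may differ slightly between versions. If you run into such a difference, you should learn to look up the solution yourself."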
   ]
  },
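  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 6 - Grouping and Aggregating Data\n",
    "\n",
    "\n",
    "\n",
    "<br>\n",
    "\n",
    "**<font color='#5172F0' size=3.5>Must read 👇👇👇</font>**\n",
    "\n",
    "Most exercises in the previous 5 sections were about **data processing** with pandas.\n",
    "\n",
    "Now we finally reach **data analysis**.\n",
    "\n",
    "**Grouping and aggregating data** is one of the most frequent steps in data analysis.\n",
    "\n",
    "In this section I have collected common pandas operations for grouping and aggregating data.\n",
    "\n",
    "Note: to state the questions more clearly, most exercises in this section **keep their run results**, which are also the results I expect. If a question is unclear, use its output as a reference."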
   ]
  },
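  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Below I have mapped out the core types involved in a DataFrame operation chain and how they convert into one another.\n",
    "\n",
    "The table below summarizes the key points of these types so you can form an overall picture first.\n",
    "\n",
    "| Type | Where it comes from (example methods) | Key characteristics and purpose | Common next steps |\n",
    "| :--- | :--- | :--- | :--- |\n",
    "| **DataFrame** | Data entry point (`pd.read_csv()`, `pd.DataFrame()`) | Two-dimensional table with a row index and column labels. | Almost everything: filtering, assignment, aggregation, and so on. |\n",
    "| **Series** | Selecting a single column (`df['col']`), reductions such as `df.sum()` | One-dimensional labeled array; can be thought of as a single column of a DataFrame. | Vectorized arithmetic, `.apply()`, string methods (`.str`), and so on. |\n",
    "| **DataFrameGroupBy** | Grouping by a column (`df.groupby('key')`) | A lazy grouping state: it records how to split the data and computes nothing until a method is called. | Aggregation (`.sum()`, `.mean()`), transformation (`.transform()`), filtering (`.filter()`). |\n",
    "| **SeriesGroupBy** | Selecting a single column after grouping (`df.groupby('key')['col']`) | The lazy grouped state of a single column. | Aggregation (`.sum()`), transformation (`.transform()`). |\n",
    "| **Resampler** | Resampling a time series (`df.resample('M')`; pandas 2.2+ prefers the alias `'ME'`) | A lazy grouping state for time series, grouped by time frequency. It needs a datetime-like index or the `on=` argument. | Time-series aggregation (`.sum()`, `.ohlc()`). |\n",
    "| **DataFrame (aggregation result)** | Running an aggregation on a `DataFrameGroupBy` object | An ordinary `pandas.DataFrame`, not a separate type. By default its index holds the group keys. | Standard DataFrame operations. |\n",
    "\n",
    "### 💡 Understanding how types flow through a method chain\n",
    "\n",
    "In a method chain, type conversions follow a fairly clear path. A typical chain can be summarized as: **it starts from a DataFrame, may turn into intermediate objects along the way (such as GroupBy objects), and finally returns to a DataFrame (or Series) for the next step or for output.**\n",
    "\n",
    "This flow is the basis of the expressive power of method chaining. For example, in the chain `df.groupby('key').agg({'col': 'mean'}).query('col > 10')`, the data goes through `DataFrame -> DataFrameGroupBy -> DataFrame -> DataFrame`.\n",
    "\n",
    "Once you know these conversions, you can write chains with more confidence and predict the result of each step.\n",
    "\n",
    "### 💎 Core principles and practical tips\n",
    "\n",
    "1.  **Lazy objects are the hubs**: `DataFrameGroupBy`, `SeriesGroupBy`, and `Resampler` are the key hubs of a chain. They hold the grouping rule until an action (such as `agg` or `transform`) triggers the actual computation and returns a new `DataFrame` or `Series`.\n",
    "2.  **Use pipe for custom functions**: when a chain needs complex custom logic, use the `pipe()` method. It passes the intermediate result (a DataFrame or Series) to any custom function while keeping the chain intact, which adds a lot of flexibility.\n",
    "3.  **Watch the index**: grouping, resampling, and similar operations often change the index. Keep track of the index while chaining, and call `reset_index()` when needed to turn the index back into columns so that later steps work correctly.\n",
    "4.  **Debugging long chains**: if a long chain raises an error, **run it in segments** to locate the problem. Execute the chain one piece at a time from the start and check that the type and content returned at each step match your expectations.\n"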
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Initialization\n",
    "\n",
    "<br>\n",
    "\n",
    "This `Notebook` is the **exercise-only edition**.\n",
    "\n",
    "If you need answers or hints, search WeChat for the official account 早起Python."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Loading the data"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "import pandas as pd\n",
    "\n",
    "df = pd.read_csv(\"某招聘网站数据.csv\",parse_dates=['createTime'])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Grouping"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 1 - Grouped statistics | Mean\n",
    "\n",
    "Compute the mean salary (`salary`) for each district (`district`)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "application/vnd.microsoft.datawrangler.viewer.v0+json": {
       "columns": [
        {
         "name": "district",
         "rawType": "object",
         "type": "string"
        },
        {
         "name": "salary",
         "rawType": "float64",
         "type": "float"
        }
       ],
       "ref": "85434bf6-3e2c-4f93-a56e-c29544aecddf",
       "rows": [
        [
         "上城区",
         "26250.0"
        ],
        [
         "下沙",
         "30000.0"
        ],
        [
         "余杭区",
         "33583.33"
        ],
        [
         "拱墅区",
         "28500.0"
        ],
        [
         "江干区",
         "25250.0"
        ],
        [
         "滨江区",
         "31428.57"
        ],
        [
         "萧山区",
         "36250.0"
        ],
        [
         "西湖区",
         "30893.94"
        ]
       ],
       "shape": {
        "columns": 1,
        "rows": 8
       }
      },
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>salary</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>district</th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>上城区</th>\n",
       "      <td>26250.00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>下沙</th>\n",
       "      <td>30000.00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>余杭区</th>\n",
       "      <td>33583.33</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>拱墅区</th>\n",
       "      <td>28500.00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>江干区</th>\n",
       "      <td>25250.00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>滨江区</th>\n",
       "      <td>31428.57</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>萧山区</th>\n",
       "      <td>36250.00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>西湖区</th>\n",
       "      <td>30893.94</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "            salary\n",
       "district          \n",
       "上城区       26250.00\n",
       "下沙        30000.00\n",
       "余杭区       33583.33\n",
       "拱墅区       28500.00\n",
       "江干区       25250.00\n",
       "滨江区       31428.57\n",
       "萧山区       36250.00\n",
       "西湖区       30893.94"
      ]
     },
     "execution_count": 6,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.groupby('district').agg({\n",
    "    'salary':'mean'\n",
    "}).round(2)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 2 - Grouped statistics | Dropping the index\n",
    "\n",
    "Repeat the grouping from the previous exercise, but do not use `district` as the index."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "application/vnd.microsoft.datawrangler.viewer.v0+json": {
       "columns": [
        {
         "name": "index",
         "rawType": "int64",
         "type": "integer"
        },
        {
         "name": "district",
         "rawType": "object",
         "type": "string"
        },
        {
         "name": "salary",
         "rawType": "float64",
         "type": "float"
        }
       ],
       "ref": "9643b9c9-31d8-49c2-816f-a9b052daf218",
       "rows": [
        [
         "0",
         "上城区",
         "26250.0"
        ],
        [
         "1",
         "下沙",
         "30000.0"
        ],
        [
         "2",
         "余杭区",
         "33583.33"
        ],
        [
         "3",
         "拱墅区",
         "28500.0"
        ],
        [
         "4",
         "江干区",
         "25250.0"
        ],
        [
         "5",
         "滨江区",
         "31428.57"
        ],
        [
         "6",
         "萧山区",
         "36250.0"
        ],
        [
         "7",
         "西湖区",
         "30893.94"
        ]
       ],
       "shape": {
        "columns": 2,
        "rows": 8
       }
      },
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>district</th>\n",
       "      <th>salary</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>上城区</td>\n",
       "      <td>26250.00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>下沙</td>\n",
       "      <td>30000.00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>余杭区</td>\n",
       "      <td>33583.33</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>拱墅区</td>\n",
       "      <td>28500.00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>江干区</td>\n",
       "      <td>25250.00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>滨江区</td>\n",
       "      <td>31428.57</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>6</th>\n",
       "      <td>萧山区</td>\n",
       "      <td>36250.00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>7</th>\n",
       "      <td>西湖区</td>\n",
       "      <td>30893.94</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "  district    salary\n",
       "0      上城区  26250.00\n",
       "1       下沙  30000.00\n",
       "2      余杭区  33583.33\n",
       "3      拱墅区  28500.00\n",
       "4      江干区  25250.00\n",
       "5      滨江区  31428.57\n",
       "6      萧山区  36250.00\n",
       "7      西湖区  30893.94"
      ]
     },
     "execution_count": 7,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.groupby('district').agg({\n",
    "    'salary':'mean'\n",
    "}).round(2).reset_index()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 3 - Grouped statistics | Sorting\n",
    "\n",
    "Compute the mean salary per district and extract the district with the highest mean."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {
    "scrolled": true
   },
   "outputs": [
    {
     "data": {
      "application/vnd.microsoft.datawrangler.viewer.v0+json": {
       "columns": [
        {
         "name": "index",
         "rawType": "int64",
         "type": "integer"
        },
        {
         "name": "district",
         "rawType": "object",
         "type": "string"
        },
        {
         "name": "salary",
         "rawType": "float64",
         "type": "float"
        }
       ],
       "ref": "b4b80c12-06fe-47c6-9b57-175695357bc9",
       "rows": [
        [
         "6",
         "萧山区",
         "36250.0"
        ]
       ],
       "shape": {
        "columns": 2,
        "rows": 1
       }
      },
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>district</th>\n",
       "      <th>salary</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>6</th>\n",
       "      <td>萧山区</td>\n",
       "      <td>36250.0</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "  district   salary\n",
       "6      萧山区  36250.0"
      ]
     },
     "execution_count": 12,
     "metadata": {},
     "output_type": "execute_result"
    }
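   ],
   "source": [
    "# Mean salary per district, with district kept as a regular column\n",
    "ndf = df.groupby('district').agg({\n",
    "    'salary': 'mean'\n",
    "}).reset_index()\n",
    "# Alternative: ndf.loc[ndf['salary'].idxmax()] returns only the first maximum, as a Series\n",
    "# The boolean mask below keeps every district tied for the highest mean\n",
    "ndf.loc[ndf['salary'] == ndf['salary'].max()]"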
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 4 - Grouped statistics | Frequency\n",
    "\n",
    "Count how many times each combination of district (`district`) and company size (`companySize`) appears."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "application/vnd.microsoft.datawrangler.viewer.v0+json": {
       "columns": [
        {
         "name": "index",
         "rawType": "int64",
         "type": "integer"
        },
        {
         "name": "district",
         "rawType": "object",
         "type": "string"
        },
        {
         "name": "companySize",
         "rawType": "object",
         "type": "string"
        },
        {
         "name": "count",
         "rawType": "int64",
         "type": "integer"
        }
       ],
       "ref": "a09c8f7d-d303-4def-b1a6-1e6e4caccf57",
       "rows": [
        [
         "0",
         "上城区",
         "50-150人",
         "2"
        ],
        [
         "1",
         "下沙",
         "150-500人",
         "1"
        ],
        [
         "2",
         "余杭区",
         "150-500人",
         "13"
        ],
        [
         "3",
         "余杭区",
         "2000人以上",
         "14"
        ],
        [
         "4",
         "余杭区",
         "50-150人",
         "7"
        ],
        [
         "5",
         "余杭区",
         "500-2000人",
         "2"
        ],
        [
         "6",
         "拱墅区",
         "2000人以上",
         "1"
        ],
        [
         "7",
         "拱墅区",
         "50-150人",
         "1"
        ],
        [
         "8",
         "拱墅区",
         "500-2000人",
         "2"
        ],
        [
         "9",
         "江干区",
         "2000人以上",
         "2"
        ],
        [
         "10",
         "江干区",
         "500-2000人",
         "2"
        ],
        [
         "11",
         "滨江区",
         "150-500人",
         "14"
        ],
        [
         "12",
         "滨江区",
         "2000人以上",
         "6"
        ],
        [
         "13",
         "滨江区",
         "500-2000人",
         "1"
        ],
        [
         "14",
         "萧山区",
         "50-150人",
         "1"
        ],
        [
         "15",
         "萧山区",
         "500-2000人",
         "3"
        ],
        [
         "16",
         "西湖区",
         "15-50人",
         "1"
        ],
        [
         "17",
         "西湖区",
         "150-500人",
         "7"
        ],
        [
         "18",
         "西湖区",
         "2000人以上",
         "11"
        ],
        [
         "19",
         "西湖区",
         "50-150人",
         "5"
        ],
        [
         "20",
         "西湖区",
         "500-2000人",
         "9"
        ]
       ],
       "shape": {
        "columns": 3,
        "rows": 21
       }
      },
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>district</th>\n",
       "      <th>companySize</th>\n",
       "      <th>count</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>上城区</td>\n",
       "      <td>50-150人</td>\n",
       "      <td>2</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>下沙</td>\n",
       "      <td>150-500人</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>余杭区</td>\n",
       "      <td>150-500人</td>\n",
       "      <td>13</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>余杭区</td>\n",
       "      <td>2000人以上</td>\n",
       "      <td>14</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>余杭区</td>\n",
       "      <td>50-150人</td>\n",
       "      <td>7</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>余杭区</td>\n",
       "      <td>500-2000人</td>\n",
       "      <td>2</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>6</th>\n",
       "      <td>拱墅区</td>\n",
       "      <td>2000人以上</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>7</th>\n",
       "      <td>拱墅区</td>\n",
       "      <td>50-150人</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>8</th>\n",
       "      <td>拱墅区</td>\n",
       "      <td>500-2000人</td>\n",
       "      <td>2</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>9</th>\n",
       "      <td>江干区</td>\n",
       "      <td>2000人以上</td>\n",
       "      <td>2</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>10</th>\n",
       "      <td>江干区</td>\n",
       "      <td>500-2000人</td>\n",
       "      <td>2</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>11</th>\n",
       "      <td>滨江区</td>\n",
       "      <td>150-500人</td>\n",
       "      <td>14</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>12</th>\n",
       "      <td>滨江区</td>\n",
       "      <td>2000人以上</td>\n",
       "      <td>6</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>13</th>\n",
       "      <td>滨江区</td>\n",
       "      <td>500-2000人</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>14</th>\n",
       "      <td>萧山区</td>\n",
       "      <td>50-150人</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>15</th>\n",
       "      <td>萧山区</td>\n",
       "      <td>500-2000人</td>\n",
       "      <td>3</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>16</th>\n",
       "      <td>西湖区</td>\n",
       "      <td>15-50人</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>17</th>\n",
       "      <td>西湖区</td>\n",
       "      <td>150-500人</td>\n",
       "      <td>7</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>18</th>\n",
       "      <td>西湖区</td>\n",
       "      <td>2000人以上</td>\n",
       "      <td>11</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>19</th>\n",
       "      <td>西湖区</td>\n",
       "      <td>50-150人</td>\n",
       "      <td>5</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>20</th>\n",
       "      <td>西湖区</td>\n",
       "      <td>500-2000人</td>\n",
       "      <td>9</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   district companySize  count\n",
       "0       上城区     50-150人      2\n",
       "1        下沙    150-500人      1\n",
       "2       余杭区    150-500人     13\n",
       "3       余杭区     2000人以上     14\n",
       "4       余杭区     50-150人      7\n",
       "5       余杭区   500-2000人      2\n",
       "6       拱墅区     2000人以上      1\n",
       "7       拱墅区     50-150人      1\n",
       "8       拱墅区   500-2000人      2\n",
       "9       江干区     2000人以上      2\n",
       "10      江干区   500-2000人      2\n",
       "11      滨江区    150-500人     14\n",
       "12      滨江区     2000人以上      6\n",
       "13      滨江区   500-2000人      1\n",
       "14      萧山区     50-150人      1\n",
       "15      萧山区   500-2000人      3\n",
       "16      西湖区      15-50人      1\n",
       "17      西湖区    150-500人      7\n",
       "18      西湖区     2000人以上     11\n",
       "19      西湖区     50-150人      5\n",
       "20      西湖区   500-2000人      9"
      ]
     },
     "execution_count": 14,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.groupby(by=['district','companySize']).size().reset_index(name='count')"
   ]
  },
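  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 5 - Grouped statistics | Renaming the index\n",
    "\n",
    "Rename the group keys (index names) from the previous exercise as follows:\n",
    "- `district` -> `行政区` (administrative district)\n",
    "- `companySize` -> `公司规模` (company size)"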
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "application/vnd.microsoft.datawrangler.viewer.v0+json": {
       "columns": [
        {
         "name": "index",
         "rawType": "int64",
         "type": "integer"
        },
        {
         "name": "行政区",
         "rawType": "object",
         "type": "string"
        },
        {
         "name": "公司规模",
         "rawType": "object",
         "type": "string"
        },
        {
         "name": "count",
         "rawType": "int64",
         "type": "integer"
        }
       ],
       "ref": "7794f02e-48b4-4d89-87f6-3fe8f57998a1",
       "rows": [
        [
         "0",
         "上城区",
         "50-150人",
         "2"
        ],
        [
         "1",
         "下沙",
         "150-500人",
         "1"
        ],
        [
         "2",
         "余杭区",
         "150-500人",
         "13"
        ],
        [
         "3",
         "余杭区",
         "2000人以上",
         "14"
        ],
        [
         "4",
         "余杭区",
         "50-150人",
         "7"
        ],
        [
         "5",
         "余杭区",
         "500-2000人",
         "2"
        ],
        [
         "6",
         "拱墅区",
         "2000人以上",
         "1"
        ],
        [
         "7",
         "拱墅区",
         "50-150人",
         "1"
        ],
        [
         "8",
         "拱墅区",
         "500-2000人",
         "2"
        ],
        [
         "9",
         "江干区",
         "2000人以上",
         "2"
        ],
        [
         "10",
         "江干区",
         "500-2000人",
         "2"
        ],
        [
         "11",
         "滨江区",
         "150-500人",
         "14"
        ],
        [
         "12",
         "滨江区",
         "2000人以上",
         "6"
        ],
        [
         "13",
         "滨江区",
         "500-2000人",
         "1"
        ],
        [
         "14",
         "萧山区",
         "50-150人",
         "1"
        ],
        [
         "15",
         "萧山区",
         "500-2000人",
         "3"
        ],
        [
         "16",
         "西湖区",
         "15-50人",
         "1"
        ],
        [
         "17",
         "西湖区",
         "150-500人",
         "7"
        ],
        [
         "18",
         "西湖区",
         "2000人以上",
         "11"
        ],
        [
         "19",
         "西湖区",
         "50-150人",
         "5"
        ],
        [
         "20",
         "西湖区",
         "500-2000人",
         "9"
        ]
       ],
       "shape": {
        "columns": 3,
        "rows": 21
       }
      },
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>行政区</th>\n",
       "      <th>公司规模</th>\n",
       "      <th>count</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>上城区</td>\n",
       "      <td>50-150人</td>\n",
       "      <td>2</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>下沙</td>\n",
       "      <td>150-500人</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>余杭区</td>\n",
       "      <td>150-500人</td>\n",
       "      <td>13</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>余杭区</td>\n",
       "      <td>2000人以上</td>\n",
       "      <td>14</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>余杭区</td>\n",
       "      <td>50-150人</td>\n",
       "      <td>7</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>余杭区</td>\n",
       "      <td>500-2000人</td>\n",
       "      <td>2</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>6</th>\n",
       "      <td>拱墅区</td>\n",
       "      <td>2000人以上</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>7</th>\n",
       "      <td>拱墅区</td>\n",
       "      <td>50-150人</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>8</th>\n",
       "      <td>拱墅区</td>\n",
       "      <td>500-2000人</td>\n",
       "      <td>2</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>9</th>\n",
       "      <td>江干区</td>\n",
       "      <td>2000人以上</td>\n",
       "      <td>2</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>10</th>\n",
       "      <td>江干区</td>\n",
       "      <td>500-2000人</td>\n",
       "      <td>2</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>11</th>\n",
       "      <td>滨江区</td>\n",
       "      <td>150-500人</td>\n",
       "      <td>14</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>12</th>\n",
       "      <td>滨江区</td>\n",
       "      <td>2000人以上</td>\n",
       "      <td>6</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>13</th>\n",
       "      <td>滨江区</td>\n",
       "      <td>500-2000人</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>14</th>\n",
       "      <td>萧山区</td>\n",
       "      <td>50-150人</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>15</th>\n",
       "      <td>萧山区</td>\n",
       "      <td>500-2000人</td>\n",
       "      <td>3</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>16</th>\n",
       "      <td>西湖区</td>\n",
       "      <td>15-50人</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>17</th>\n",
       "      <td>西湖区</td>\n",
       "      <td>150-500人</td>\n",
       "      <td>7</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>18</th>\n",
       "      <td>西湖区</td>\n",
       "      <td>2000人以上</td>\n",
       "      <td>11</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>19</th>\n",
       "      <td>西湖区</td>\n",
       "      <td>50-150人</td>\n",
       "      <td>5</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>20</th>\n",
       "      <td>西湖区</td>\n",
       "      <td>500-2000人</td>\n",
       "      <td>9</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "    行政区       公司规模  count\n",
       "0   上城区    50-150人      2\n",
       "1    下沙   150-500人      1\n",
       "2   余杭区   150-500人     13\n",
       "3   余杭区    2000人以上     14\n",
       "4   余杭区    50-150人      7\n",
       "5   余杭区  500-2000人      2\n",
       "6   拱墅区    2000人以上      1\n",
       "7   拱墅区    50-150人      1\n",
       "8   拱墅区  500-2000人      2\n",
       "9   江干区    2000人以上      2\n",
       "10  江干区  500-2000人      2\n",
       "11  滨江区   150-500人     14\n",
       "12  滨江区    2000人以上      6\n",
       "13  滨江区  500-2000人      1\n",
       "14  萧山区    50-150人      1\n",
       "15  萧山区  500-2000人      3\n",
       "16  西湖区     15-50人      1\n",
       "17  西湖区   150-500人      7\n",
       "18  西湖区    2000人以上     11\n",
       "19  西湖区    50-150人      5\n",
       "20  西湖区  500-2000人      9"
      ]
     },
     "execution_count": 15,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.groupby(by=['district','companySize']).size().reset_index(name='count').rename(columns={'district':'行政区','companySize':'公司规模'})"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 6 - Grouped statistics | Counting\n",
    "\n",
    "Building on the previous exercise, count how many company entries appear in each district."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "application/vnd.microsoft.datawrangler.viewer.v0+json": {
       "columns": [
        {
         "name": "index",
         "rawType": "int64",
         "type": "integer"
        },
        {
         "name": "district",
         "rawType": "object",
         "type": "string"
        },
        {
         "name": "count",
         "rawType": "int64",
         "type": "integer"
        }
       ],
       "ref": "3335aabe-fd3b-4d52-9b71-44975a3a49f6",
       "rows": [
        [
         "0",
         "上城区",
         "2"
        ],
        [
         "1",
         "下沙",
         "1"
        ],
        [
         "2",
         "余杭区",
         "36"
        ],
        [
         "3",
         "拱墅区",
         "4"
        ],
        [
         "4",
         "江干区",
         "4"
        ],
        [
         "5",
         "滨江区",
         "21"
        ],
        [
         "6",
         "萧山区",
         "4"
        ],
        [
         "7",
         "西湖区",
         "33"
        ]
       ],
       "shape": {
        "columns": 2,
        "rows": 8
       }
      },
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>district</th>\n",
       "      <th>count</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>上城区</td>\n",
       "      <td>2</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>下沙</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>余杭区</td>\n",
       "      <td>36</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>拱墅区</td>\n",
       "      <td>4</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>江干区</td>\n",
       "      <td>4</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>滨江区</td>\n",
       "      <td>21</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>6</th>\n",
       "      <td>萧山区</td>\n",
       "      <td>4</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>7</th>\n",
       "      <td>西湖区</td>\n",
       "      <td>33</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "  district  count\n",
       "0      上城区      2\n",
       "1       下沙      1\n",
       "2      余杭区     36\n",
       "3      拱墅区      4\n",
       "4      江干区      4\n",
       "5      滨江区     21\n",
       "6      萧山区      4\n",
       "7      西湖区     33"
      ]
     },
     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.groupby(by='district').size().reset_index(name='count')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 7 - Viewing groups｜All\n",
    "\n",
    "Group the data by `district` and `salary`, then view each group (the expected result lists the number of rows in each group)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "application/vnd.microsoft.datawrangler.viewer.v0+json": {
       "columns": [
        {
         "name": "('district', 'salary')",
         "rawType": "object",
         "type": "unknown"
        },
        {
         "name": "0",
         "rawType": "int64",
         "type": "integer"
        }
       ],
       "ref": "6278261f-e299-4445-a360-4e5913b8ac4c",
       "rows": [
        [
         "('上城区', 22500)",
         "1"
        ],
        [
         "('上城区', 30000)",
         "1"
        ],
        [
         "('下沙', 30000)",
         "1"
        ],
        [
         "('余杭区', 7500)",
         "1"
        ],
        [
         "('余杭区', 20000)",
         "2"
        ],
        [
         "('余杭区', 22500)",
         "2"
        ],
        [
         "('余杭区', 25000)",
         "1"
        ],
        [
         "('余杭区', 27500)",
         "2"
        ],
        [
         "('余杭区', 29000)",
         "1"
        ],
        [
         "('余杭区', 30000)",
         "13"
        ],
        [
         "('余杭区', 35000)",
         "1"
        ],
        [
         "('余杭区', 37500)",
         "5"
        ],
        [
         "('余杭区', 40000)",
         "2"
        ],
        [
         "('余杭区', 45000)",
         "1"
        ],
        [
         "('余杭区', 50000)",
         "3"
        ],
        [
         "('余杭区', 60000)",
         "2"
        ],
        [
         "('拱墅区', 24000)",
         "1"
        ],
        [
         "('拱墅区', 30000)",
         "3"
        ],
        [
         "('江干区', 3500)",
         "1"
        ],
        [
         "('江干区', 22500)",
         "1"
        ],
        [
         "('江干区', 30000)",
         "1"
        ],
        [
         "('江干区', 45000)",
         "1"
        ],
        [
         "('滨江区', 7500)",
         "1"
        ],
        [
         "('滨江区', 15000)",
         "1"
        ],
        [
         "('滨江区', 20000)",
         "2"
        ],
        [
         "('滨江区', 22500)",
         "1"
        ],
        [
         "('滨江区', 30000)",
         "7"
        ],
        [
         "('滨江区', 32500)",
         "1"
        ],
        [
         "('滨江区', 37500)",
         "4"
        ],
        [
         "('滨江区', 42500)",
         "1"
        ],
        [
         "('滨江区', 45000)",
         "2"
        ],
        [
         "('滨江区', 50000)",
         "1"
        ],
        [
         "('萧山区', 25000)",
         "1"
        ],
        [
         "('萧山区', 30000)",
         "1"
        ],
        [
         "('萧山区', 45000)",
         "2"
        ],
        [
         "('西湖区', 6500)",
         "1"
        ],
        [
         "('西湖区', 20000)",
         "1"
        ],
        [
         "('西湖区', 21500)",
         "1"
        ],
        [
         "('西湖区', 22500)",
         "2"
        ],
        [
         "('西湖区', 24000)",
         "1"
        ],
        [
         "('西湖区', 25000)",
         "1"
        ],
        [
         "('西湖区', 26500)",
         "1"
        ],
        [
         "('西湖区', 27000)",
         "1"
        ],
        [
         "('西湖区', 27500)",
         "4"
        ],
        [
         "('西湖区', 30000)",
         "7"
        ],
        [
         "('西湖区', 35000)",
         "1"
        ],
        [
         "('西湖区', 36500)",
         "1"
        ],
        [
         "('西湖区', 37500)",
         "7"
        ],
        [
         "('西湖区', 40000)",
         "2"
        ],
        [
         "('西湖区', 45000)",
         "2"
        ]
       ],
       "shape": {
        "columns": 1,
        "rows": 50
       }
      },
      "text/plain": [
       "district  salary\n",
       "上城区       22500      1\n",
       "          30000      1\n",
       "下沙        30000      1\n",
       "余杭区       7500       1\n",
       "          20000      2\n",
       "          22500      2\n",
       "          25000      1\n",
       "          27500      2\n",
       "          29000      1\n",
       "          30000     13\n",
       "          35000      1\n",
       "          37500      5\n",
       "          40000      2\n",
       "          45000      1\n",
       "          50000      3\n",
       "          60000      2\n",
       "拱墅区       24000      1\n",
       "          30000      3\n",
       "江干区       3500       1\n",
       "          22500      1\n",
       "          30000      1\n",
       "          45000      1\n",
       "滨江区       7500       1\n",
       "          15000      1\n",
       "          20000      2\n",
       "          22500      1\n",
       "          30000      7\n",
       "          32500      1\n",
       "          37500      4\n",
       "          42500      1\n",
       "          45000      2\n",
       "          50000      1\n",
       "萧山区       25000      1\n",
       "          30000      1\n",
       "          45000      2\n",
       "西湖区       6500       1\n",
       "          20000      1\n",
       "          21500      1\n",
       "          22500      2\n",
       "          24000      1\n",
       "          25000      1\n",
       "          26500      1\n",
       "          27000      1\n",
       "          27500      4\n",
       "          30000      7\n",
       "          35000      1\n",
       "          36500      1\n",
       "          37500      7\n",
       "          40000      2\n",
       "          45000      2\n",
       "dtype: int64"
      ]
     },
     "execution_count": 2,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.groupby(by=['district','salary']).size()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 8 - Viewing groups｜Specific\n",
    "\n",
    "Group the data by `district` and `salary`, then view the positions in 西湖区 with a salary of 30000."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "application/vnd.microsoft.datawrangler.viewer.v0+json": {
       "columns": [
        {
         "name": "index",
         "rawType": "int64",
         "type": "integer"
        },
        {
         "name": "positionName",
         "rawType": "object",
         "type": "string"
        },
        {
         "name": "companySize",
         "rawType": "object",
         "type": "string"
        },
        {
         "name": "industryField",
         "rawType": "object",
         "type": "string"
        },
        {
         "name": "financeStage",
         "rawType": "object",
         "type": "string"
        },
        {
         "name": "companyLabelList",
         "rawType": "object",
         "type": "string"
        },
        {
         "name": "firstType",
         "rawType": "object",
         "type": "string"
        },
        {
         "name": "secondType",
         "rawType": "object",
         "type": "string"
        },
        {
         "name": "thirdType",
         "rawType": "object",
         "type": "string"
        },
        {
         "name": "createTime",
         "rawType": "datetime64[ns]",
         "type": "datetime"
        },
        {
         "name": "district",
         "rawType": "object",
         "type": "string"
        },
        {
         "name": "salary",
         "rawType": "int64",
         "type": "integer"
        },
        {
         "name": "workYear",
         "rawType": "object",
         "type": "string"
        },
        {
         "name": "jobNature",
         "rawType": "object",
         "type": "string"
        },
        {
         "name": "education",
         "rawType": "object",
         "type": "string"
        },
        {
         "name": "positionAdvantage",
         "rawType": "object",
         "type": "string"
        },
        {
         "name": "imState",
         "rawType": "object",
         "type": "string"
        },
        {
         "name": "score",
         "rawType": "int64",
         "type": "integer"
        },
        {
         "name": "matchScore",
         "rawType": "float64",
         "type": "float"
        },
        {
         "name": "famousCompany",
         "rawType": "bool",
         "type": "boolean"
        }
       ],
       "ref": "1056420e-9bea-4b18-a62a-39c1738fb42b",
       "rows": [
        [
         "11",
         "大数据分析工程师(J11108)",
         "2000人以上",
         "移动互联网,企业服务",
         "上市公司",
         "['技能培训', '年底双薪', '带薪年假', '岗位晋升']",
         "开发|测试|运维类",
         "数据开发",
         "数据分析",
         "2020-03-16 09:25:00",
         "西湖区",
         "30000",
         "应届毕业生",
         "全职",
         "本科",
         "六险一金 带薪年假 年度体检 周末双休",
         "today",
         "17",
         "4.2450657",
         "False"
        ],
        [
         "27",
         "数据分析经理",
         "2000人以上",
         "硬件",
         "不需要融资",
         "['年终分红', '带薪年假', '年度旅游', '岗位晋升']",
         "产品|需求|项目类",
         "数据分析",
         "数据分析",
         "2020-03-16 11:24:00",
         "西湖区",
         "30000",
         "5-10年",
         "全职",
         "本科",
         "股票期权,千万级用户,试用期全薪",
         "today",
         "6",
         "1.1640825",
         "True"
        ],
        [
         "33",
         "数据分析师（社招）",
         "500-2000人",
         "移动互联网",
         "上市公司",
         "['绩效奖金', '股票期权', '年底双薪', '专项奖金']",
         "产品|需求|项目类",
         "数据分析",
         "数据分析",
         "2020-03-16 11:18:00",
         "西湖区",
         "30000",
         "应届毕业生",
         "全职",
         "不限",
         "16-18薪 大数据A股上市公司",
         "today",
         "15",
         "1.0203772",
         "True"
        ],
        [
         "34",
         "商业数据分析师",
         "50-150人",
         "移动互联网,企业服务",
         "B轮",
         "['定期体检', '帅哥多', '领导好', '美女多']",
         "市场|商务类",
         "市场|营销",
         "商业数据分析",
         "2020-03-16 11:13:00",
         "西湖区",
         "30000",
         "1-3年",
         "全职",
         "硕士",
         "发挥空间大,职业发展好",
         "today",
         "5",
         "1.0956326",
         "False"
        ],
        [
         "85",
         "高级数据分析师",
         "500-2000人",
         "移动互联网",
         "上市公司",
         "['包午餐晚餐', '奖金多多多', '零食下午茶', '全员出国游']",
         "产品|需求|项目类",
         "数据分析",
         "数据分析",
         "2020-03-14 21:28:00",
         "西湖区",
         "30000",
         "3-5年",
         "全职",
         "本科",
         "福利好，年轻有活力，行业前景好",
         "today",
         "2",
         "0.3895032",
         "False"
        ],
        [
         "88",
         "资深数据分析师",
         "500-2000人",
         "移动互联网",
         "A轮",
         "['岗位晋升', '年度旅游', '年底双薪', '午餐补助']",
         "产品|需求|项目类",
         "数据分析",
         "数据分析",
         "2020-03-15 19:43:00",
         "西湖区",
         "30000",
         "3-5年",
         "全职",
         "大专",
         "六险一金,餐饮补贴,双休,出国旅游",
         "today",
         "1",
         "0.5023706",
         "False"
        ],
        [
         "98",
         "数据分析建模工程师",
         "50-150人",
         "数据服务,信息安全",
         "A轮",
         "['午餐补助', '带薪年假', '16到18薪', '法定节假日']",
         "开发|测试|运维类",
         "人工智能",
         "机器学习",
         "2020-03-14 19:00:00",
         "西湖区",
         "30000",
         "1-3年",
         "全职",
         "本科",
         "海量数据 全链路建模实践 16-18薪",
         "threeDays",
         "0",
         "0.3563076",
         "False"
        ]
       ],
       "shape": {
        "columns": 19,
        "rows": 7
       }
      },
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>positionName</th>\n",
       "      <th>companySize</th>\n",
       "      <th>industryField</th>\n",
       "      <th>financeStage</th>\n",
       "      <th>companyLabelList</th>\n",
       "      <th>firstType</th>\n",
       "      <th>secondType</th>\n",
       "      <th>thirdType</th>\n",
       "      <th>createTime</th>\n",
       "      <th>district</th>\n",
       "      <th>salary</th>\n",
       "      <th>workYear</th>\n",
       "      <th>jobNature</th>\n",
       "      <th>education</th>\n",
       "      <th>positionAdvantage</th>\n",
       "      <th>imState</th>\n",
       "      <th>score</th>\n",
       "      <th>matchScore</th>\n",
       "      <th>famousCompany</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>11</th>\n",
       "      <td>大数据分析工程师(J11108)</td>\n",
       "      <td>2000人以上</td>\n",
       "      <td>移动互联网,企业服务</td>\n",
       "      <td>上市公司</td>\n",
       "      <td>['技能培训', '年底双薪', '带薪年假', '岗位晋升']</td>\n",
       "      <td>开发|测试|运维类</td>\n",
       "      <td>数据开发</td>\n",
       "      <td>数据分析</td>\n",
       "      <td>2020-03-16 09:25:00</td>\n",
       "      <td>西湖区</td>\n",
       "      <td>30000</td>\n",
       "      <td>应届毕业生</td>\n",
       "      <td>全职</td>\n",
       "      <td>本科</td>\n",
       "      <td>六险一金 带薪年假 年度体检 周末双休</td>\n",
       "      <td>today</td>\n",
       "      <td>17</td>\n",
       "      <td>4.245066</td>\n",
       "      <td>False</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>27</th>\n",
       "      <td>数据分析经理</td>\n",
       "      <td>2000人以上</td>\n",
       "      <td>硬件</td>\n",
       "      <td>不需要融资</td>\n",
       "      <td>['年终分红', '带薪年假', '年度旅游', '岗位晋升']</td>\n",
       "      <td>产品|需求|项目类</td>\n",
       "      <td>数据分析</td>\n",
       "      <td>数据分析</td>\n",
       "      <td>2020-03-16 11:24:00</td>\n",
       "      <td>西湖区</td>\n",
       "      <td>30000</td>\n",
       "      <td>5-10年</td>\n",
       "      <td>全职</td>\n",
       "      <td>本科</td>\n",
       "      <td>股票期权,千万级用户,试用期全薪</td>\n",
       "      <td>today</td>\n",
       "      <td>6</td>\n",
       "      <td>1.164082</td>\n",
       "      <td>True</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>33</th>\n",
       "      <td>数据分析师（社招）</td>\n",
       "      <td>500-2000人</td>\n",
       "      <td>移动互联网</td>\n",
       "      <td>上市公司</td>\n",
       "      <td>['绩效奖金', '股票期权', '年底双薪', '专项奖金']</td>\n",
       "      <td>产品|需求|项目类</td>\n",
       "      <td>数据分析</td>\n",
       "      <td>数据分析</td>\n",
       "      <td>2020-03-16 11:18:00</td>\n",
       "      <td>西湖区</td>\n",
       "      <td>30000</td>\n",
       "      <td>应届毕业生</td>\n",
       "      <td>全职</td>\n",
       "      <td>不限</td>\n",
       "      <td>16-18薪 大数据A股上市公司</td>\n",
       "      <td>today</td>\n",
       "      <td>15</td>\n",
       "      <td>1.020377</td>\n",
       "      <td>True</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>34</th>\n",
       "      <td>商业数据分析师</td>\n",
       "      <td>50-150人</td>\n",
       "      <td>移动互联网,企业服务</td>\n",
       "      <td>B轮</td>\n",
       "      <td>['定期体检', '帅哥多', '领导好', '美女多']</td>\n",
       "      <td>市场|商务类</td>\n",
       "      <td>市场|营销</td>\n",
       "      <td>商业数据分析</td>\n",
       "      <td>2020-03-16 11:13:00</td>\n",
       "      <td>西湖区</td>\n",
       "      <td>30000</td>\n",
       "      <td>1-3年</td>\n",
       "      <td>全职</td>\n",
       "      <td>硕士</td>\n",
       "      <td>发挥空间大,职业发展好</td>\n",
       "      <td>today</td>\n",
       "      <td>5</td>\n",
       "      <td>1.095633</td>\n",
       "      <td>False</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>85</th>\n",
       "      <td>高级数据分析师</td>\n",
       "      <td>500-2000人</td>\n",
       "      <td>移动互联网</td>\n",
       "      <td>上市公司</td>\n",
       "      <td>['包午餐晚餐', '奖金多多多', '零食下午茶', '全员出国游']</td>\n",
       "      <td>产品|需求|项目类</td>\n",
       "      <td>数据分析</td>\n",
       "      <td>数据分析</td>\n",
       "      <td>2020-03-14 21:28:00</td>\n",
       "      <td>西湖区</td>\n",
       "      <td>30000</td>\n",
       "      <td>3-5年</td>\n",
       "      <td>全职</td>\n",
       "      <td>本科</td>\n",
       "      <td>福利好，年轻有活力，行业前景好</td>\n",
       "      <td>today</td>\n",
       "      <td>2</td>\n",
       "      <td>0.389503</td>\n",
       "      <td>False</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>88</th>\n",
       "      <td>资深数据分析师</td>\n",
       "      <td>500-2000人</td>\n",
       "      <td>移动互联网</td>\n",
       "      <td>A轮</td>\n",
       "      <td>['岗位晋升', '年度旅游', '年底双薪', '午餐补助']</td>\n",
       "      <td>产品|需求|项目类</td>\n",
       "      <td>数据分析</td>\n",
       "      <td>数据分析</td>\n",
       "      <td>2020-03-15 19:43:00</td>\n",
       "      <td>西湖区</td>\n",
       "      <td>30000</td>\n",
       "      <td>3-5年</td>\n",
       "      <td>全职</td>\n",
       "      <td>大专</td>\n",
       "      <td>六险一金,餐饮补贴,双休,出国旅游</td>\n",
       "      <td>today</td>\n",
       "      <td>1</td>\n",
       "      <td>0.502371</td>\n",
       "      <td>False</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>98</th>\n",
       "      <td>数据分析建模工程师</td>\n",
       "      <td>50-150人</td>\n",
       "      <td>数据服务,信息安全</td>\n",
       "      <td>A轮</td>\n",
       "      <td>['午餐补助', '带薪年假', '16到18薪', '法定节假日']</td>\n",
       "      <td>开发|测试|运维类</td>\n",
       "      <td>人工智能</td>\n",
       "      <td>机器学习</td>\n",
       "      <td>2020-03-14 19:00:00</td>\n",
       "      <td>西湖区</td>\n",
       "      <td>30000</td>\n",
       "      <td>1-3年</td>\n",
       "      <td>全职</td>\n",
       "      <td>本科</td>\n",
       "      <td>海量数据 全链路建模实践 16-18薪</td>\n",
       "      <td>threeDays</td>\n",
       "      <td>0</td>\n",
       "      <td>0.356308</td>\n",
       "      <td>False</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "        positionName companySize industryField financeStage  \\\n",
       "11  大数据分析工程师(J11108)     2000人以上    移动互联网,企业服务         上市公司   \n",
       "27            数据分析经理     2000人以上            硬件        不需要融资   \n",
       "33         数据分析师（社招）   500-2000人         移动互联网         上市公司   \n",
       "34           商业数据分析师     50-150人    移动互联网,企业服务           B轮   \n",
       "85           高级数据分析师   500-2000人         移动互联网         上市公司   \n",
       "88           资深数据分析师   500-2000人         移动互联网           A轮   \n",
       "98         数据分析建模工程师     50-150人     数据服务,信息安全           A轮   \n",
       "\n",
       "                        companyLabelList  firstType secondType thirdType  \\\n",
       "11      ['技能培训', '年底双薪', '带薪年假', '岗位晋升']  开发|测试|运维类       数据开发      数据分析   \n",
       "27      ['年终分红', '带薪年假', '年度旅游', '岗位晋升']  产品|需求|项目类       数据分析      数据分析   \n",
       "33      ['绩效奖金', '股票期权', '年底双薪', '专项奖金']  产品|需求|项目类       数据分析      数据分析   \n",
       "34         ['定期体检', '帅哥多', '领导好', '美女多']     市场|商务类      市场|营销    商业数据分析   \n",
       "85  ['包午餐晚餐', '奖金多多多', '零食下午茶', '全员出国游']  产品|需求|项目类       数据分析      数据分析   \n",
       "88      ['岗位晋升', '年度旅游', '年底双薪', '午餐补助']  产品|需求|项目类       数据分析      数据分析   \n",
       "98   ['午餐补助', '带薪年假', '16到18薪', '法定节假日']  开发|测试|运维类       人工智能      机器学习   \n",
       "\n",
       "            createTime district  salary workYear jobNature education  \\\n",
       "11 2020-03-16 09:25:00      西湖区   30000    应届毕业生        全职        本科   \n",
       "27 2020-03-16 11:24:00      西湖区   30000    5-10年        全职        本科   \n",
       "33 2020-03-16 11:18:00      西湖区   30000    应届毕业生        全职        不限   \n",
       "34 2020-03-16 11:13:00      西湖区   30000     1-3年        全职        硕士   \n",
       "85 2020-03-14 21:28:00      西湖区   30000     3-5年        全职        本科   \n",
       "88 2020-03-15 19:43:00      西湖区   30000     3-5年        全职        大专   \n",
       "98 2020-03-14 19:00:00      西湖区   30000     1-3年        全职        本科   \n",
       "\n",
       "      positionAdvantage    imState  score  matchScore  famousCompany  \n",
       "11  六险一金 带薪年假 年度体检 周末双休      today     17    4.245066          False  \n",
       "27     股票期权,千万级用户,试用期全薪      today      6    1.164082           True  \n",
       "33     16-18薪 大数据A股上市公司      today     15    1.020377           True  \n",
       "34          发挥空间大,职业发展好      today      5    1.095633          False  \n",
       "85      福利好，年轻有活力，行业前景好      today      2    0.389503          False  \n",
       "88    六险一金,餐饮补贴,双休,出国旅游      today      1    0.502371          False  \n",
       "98  海量数据 全链路建模实践 16-18薪  threeDays      0    0.356308          False  "
      ]
     },
     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.groupby(by=['district','salary']).get_group(('西湖区',30000))"
   ]
  },
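  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "This exercise highlights a key property of `groupby`: **grouping does not filter data out. It builds a structured index over the data so that a specific subset can be located quickly.**\n",
    "\n",
    "The detail rows of every group are not displayed directly because `df.groupby()` returns a `DataFrameGroupBy` object, which acts more like a map of the groups than a new DataFrame. Viewing the rows of a particular group requires one of the methods below:\n",
    "\n",
    "| Method | Purpose | Example for this dataset |\n",
    "| :--- | :--- | :--- |\n",
    "| **`get_group()`** | View all rows of **one specific group**. | `df.groupby(['district', 'salary']).get_group(('西湖区', 30000))` |\n",
    "| **Iteration** | Visit every group in turn; useful for debugging or batch processing. | `for (district, salary), group_data in df.groupby(['district', 'salary']): ...` |\n",
    "| **`groups` attribute** | Inspect the grouping structure, i.e. the row index labels of the original data that belong to each group. | `df.groupby(['district', 'salary']).groups` |\n",
    "\n",
    "To build intuition, think of iterating over a `DataFrameGroupBy` object as iterating over the items of a dictionary:\n",
    "- **Key**: the name of the group, for example `('西湖区', 30000)`.\n",
    "- **Value**: a DataFrame made up of all rows that belong to that group.\n",
    "\n",
    "Grouping is usually followed by an aggregation, but some methods work group by group while still returning row-level data:\n",
    "\n",
    "- `head()` shows the first few rows of each group: `df.groupby(['district', 'salary']).head(2)` returns a new DataFrame containing the first two rows of every group.\n",
    "- `apply()` allows more flexible processing: a custom function is called on each group and its results are combined."
   ]
  },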
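  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The viewing methods above can be tried on a small frame. This is a minimal sketch, not the notebook's reference answer: `toy` and its values are made-up stand-ins for the job-posting data, and only the column names `district`, `salary`, and `positionName` are borrowed from it.\n",
    "\n",
    "```python\n",
    "import pandas as pd\n",
    "\n",
    "# Hypothetical toy data standing in for the notebook's job-posting DataFrame\n",
    "toy = pd.DataFrame({\n",
    "    'district': ['西湖区', '西湖区', '余杭区', '西湖区'],\n",
    "    'salary': [30000, 30000, 30000, 20000],\n",
    "    'positionName': ['A', 'B', 'C', 'D'],\n",
    "})\n",
    "\n",
    "grouped = toy.groupby(['district', 'salary'])\n",
    "\n",
    "# 1. get_group: all rows of one group, selected by its key tuple\n",
    "print(grouped.get_group(('西湖区', 30000)))\n",
    "\n",
    "# 2. Iteration: each step yields a (key, sub-DataFrame) pair\n",
    "for (district, salary), group_data in grouped:\n",
    "    print(district, salary, len(group_data))\n",
    "\n",
    "# 3. groups: maps each key to the row labels of the original frame\n",
    "print(grouped.groups)\n",
    "\n",
    "# head(1): the first row of every group, returned as one ordinary DataFrame\n",
    "print(grouped.head(1))\n",
    "```\n",
    "\n",
    "Keeping the `groupby` result in a variable such as `grouped` lets several lookups reuse the same grouping."
   ]
  },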
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 9 - Grouping rules｜Via anonymous function 1\n",
    "\n",
    "Using the `createTime` column, count the number of positions newly posted each day in each district."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "application/vnd.microsoft.datawrangler.viewer.v0+json": {
       "columns": [
        {
         "name": "index",
         "rawType": "int64",
         "type": "integer"
        },
        {
         "name": "createTime",
         "rawType": "datetime64[ns]",
         "type": "datetime"
        },
        {
         "name": "district",
         "rawType": "object",
         "type": "string"
        },
        {
         "name": "新增岗位数量",
         "rawType": "int64",
         "type": "integer"
        }
       ],
       "ref": "c772c2cf-6791-421e-b7fe-012576c60ae3",
       "rows": [
        [
         "0",
         "2020-03-09 00:00:00",
         "余杭区",
         "1"
        ],
        [
         "1",
         "2020-03-10 00:00:00",
         "拱墅区",
         "1"
        ],
        [
         "2",
         "2020-03-11 00:00:00",
         "萧山区",
         "1"
        ],
        [
         "3",
         "2020-03-11 00:00:00",
         "西湖区",
         "1"
        ],
        [
         "4",
         "2020-03-12 00:00:00",
         "上城区",
         "1"
        ],
        [
         "5",
         "2020-03-13 00:00:00",
         "西湖区",
         "1"
        ],
        [
         "6",
         "2020-03-14 00:00:00",
         "余杭区",
         "1"
        ],
        [
         "7",
         "2020-03-14 00:00:00",
         "拱墅区",
         "1"
        ],
        [
         "8",
         "2020-03-14 00:00:00",
         "滨江区",
         "1"
        ],
        [
         "9",
         "2020-03-14 00:00:00",
         "西湖区",
         "3"
        ],
        [
         "10",
         "2020-03-15 00:00:00",
         "余杭区",
         "6"
        ],
        [
         "11",
         "2020-03-15 00:00:00",
         "滨江区",
         "2"
        ],
        [
         "12",
         "2020-03-15 00:00:00",
         "西湖区",
         "1"
        ],
        [
         "13",
         "2020-03-16 00:00:00",
         "上城区",
         "1"
        ],
        [
         "14",
         "2020-03-16 00:00:00",
         "下沙",
         "1"
        ],
        [
         "15",
         "2020-03-16 00:00:00",
         "余杭区",
         "28"
        ],
        [
         "16",
         "2020-03-16 00:00:00",
         "拱墅区",
         "2"
        ],
        [
         "17",
         "2020-03-16 00:00:00",
         "江干区",
         "4"
        ],
        [
         "18",
         "2020-03-16 00:00:00",
         "滨江区",
         "18"
        ],
        [
         "19",
         "2020-03-16 00:00:00",
         "萧山区",
         "3"
        ],
        [
         "20",
         "2020-03-16 00:00:00",
         "西湖区",
         "27"
        ]
       ],
       "shape": {
        "columns": 3,
        "rows": 21
       }
      },
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>createTime</th>\n",
       "      <th>district</th>\n",
       "      <th>新增岗位数量</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>2020-03-09</td>\n",
       "      <td>余杭区</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>2020-03-10</td>\n",
       "      <td>拱墅区</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>2020-03-11</td>\n",
       "      <td>萧山区</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>2020-03-11</td>\n",
       "      <td>西湖区</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>2020-03-12</td>\n",
       "      <td>上城区</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>2020-03-13</td>\n",
       "      <td>西湖区</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>6</th>\n",
       "      <td>2020-03-14</td>\n",
       "      <td>余杭区</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>7</th>\n",
       "      <td>2020-03-14</td>\n",
       "      <td>拱墅区</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>8</th>\n",
       "      <td>2020-03-14</td>\n",
       "      <td>滨江区</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>9</th>\n",
       "      <td>2020-03-14</td>\n",
       "      <td>西湖区</td>\n",
       "      <td>3</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>10</th>\n",
       "      <td>2020-03-15</td>\n",
       "      <td>余杭区</td>\n",
       "      <td>6</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>11</th>\n",
       "      <td>2020-03-15</td>\n",
       "      <td>滨江区</td>\n",
       "      <td>2</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>12</th>\n",
       "      <td>2020-03-15</td>\n",
       "      <td>西湖区</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>13</th>\n",
       "      <td>2020-03-16</td>\n",
       "      <td>上城区</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>14</th>\n",
       "      <td>2020-03-16</td>\n",
       "      <td>下沙</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>15</th>\n",
       "      <td>2020-03-16</td>\n",
       "      <td>余杭区</td>\n",
       "      <td>28</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>16</th>\n",
       "      <td>2020-03-16</td>\n",
       "      <td>拱墅区</td>\n",
       "      <td>2</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>17</th>\n",
       "      <td>2020-03-16</td>\n",
       "      <td>江干区</td>\n",
       "      <td>4</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>18</th>\n",
       "      <td>2020-03-16</td>\n",
       "      <td>滨江区</td>\n",
       "      <td>18</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>19</th>\n",
       "      <td>2020-03-16</td>\n",
       "      <td>萧山区</td>\n",
       "      <td>3</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>20</th>\n",
       "      <td>2020-03-16</td>\n",
       "      <td>西湖区</td>\n",
       "      <td>27</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   createTime district  新增岗位数量\n",
       "0  2020-03-09      余杭区       1\n",
       "1  2020-03-10      拱墅区       1\n",
       "2  2020-03-11      萧山区       1\n",
       "3  2020-03-11      西湖区       1\n",
       "4  2020-03-12      上城区       1\n",
       "5  2020-03-13      西湖区       1\n",
       "6  2020-03-14      余杭区       1\n",
       "7  2020-03-14      拱墅区       1\n",
       "8  2020-03-14      滨江区       1\n",
       "9  2020-03-14      西湖区       3\n",
       "10 2020-03-15      余杭区       6\n",
       "11 2020-03-15      滨江区       2\n",
       "12 2020-03-15      西湖区       1\n",
       "13 2020-03-16      上城区       1\n",
       "14 2020-03-16       下沙       1\n",
       "15 2020-03-16      余杭区      28\n",
       "16 2020-03-16      拱墅区       2\n",
       "17 2020-03-16      江干区       4\n",
       "18 2020-03-16      滨江区      18\n",
       "19 2020-03-16      萧山区       3\n",
       "20 2020-03-16      西湖区      27"
      ]
     },
     "execution_count": 8,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.groupby(by=[pd.Grouper(key='createTime',freq='D'),'district']).size().reset_index(name='新增岗位数量')"
   ]
  },
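  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 10 - Grouping rules | Via an anonymous function (2)\n",
    "\n",
    "For each district, count the postings whose industry field (`industryField`) contains 电商 (e-commerce).\n",
    "\n"
   ]
  },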
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {
    "scrolled": true
   },
   "outputs": [
    {
     "data": {
      "application/vnd.microsoft.datawrangler.viewer.v0+json": {
       "columns": [
        {
         "name": "index",
         "rawType": "int64",
         "type": "integer"
        },
        {
         "name": "district",
         "rawType": "object",
         "type": "string"
        },
        {
         "name": "count",
         "rawType": "int64",
         "type": "integer"
        }
       ],
       "ref": "4ebbf0c3-4d37-44cd-84f0-2ef9ab9807bd",
       "rows": [
        [
         "0",
         "上城区",
         "0"
        ],
        [
         "1",
         "下沙",
         "1"
        ],
        [
         "2",
         "余杭区",
         "9"
        ],
        [
         "3",
         "拱墅区",
         "0"
        ],
        [
         "4",
         "江干区",
         "2"
        ],
        [
         "5",
         "滨江区",
         "9"
        ],
        [
         "6",
         "萧山区",
         "3"
        ],
        [
         "7",
         "西湖区",
         "4"
        ]
       ],
       "shape": {
        "columns": 2,
        "rows": 8
       }
      },
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>district</th>\n",
       "      <th>count</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>上城区</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>下沙</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>余杭区</td>\n",
       "      <td>9</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>拱墅区</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>江干区</td>\n",
       "      <td>2</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>滨江区</td>\n",
       "      <td>9</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>6</th>\n",
       "      <td>萧山区</td>\n",
       "      <td>3</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>7</th>\n",
       "      <td>西湖区</td>\n",
       "      <td>4</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "  district  count\n",
       "0      上城区      0\n",
       "1       下沙      1\n",
       "2      余杭区      9\n",
       "3      拱墅区      0\n",
       "4      江干区      2\n",
       "5      滨江区      9\n",
       "6      萧山区      3\n",
       "7      西湖区      4"
      ]
     },
     "execution_count": 12,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.groupby('district')['industryField'].apply(lambda x:x.str.contains('电商',na=False).sum()).reset_index(name='count')"
   ]
  },
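  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Step-by-step walkthrough of the code\n",
    "\n",
    "| Step | Return type | Purpose and explanation |\n",
    "| :--- | :--- | :--- |\n",
    "| 1. `df.groupby('district')` | `DataFrameGroupBy` object | **Group**: splits the original DataFrame into one group per distinct value of the 'district' column (e.g. 余杭区, 西湖区). |\n",
    "| 2. `['industryField']` | `SeriesGroupBy` object | **Select a column**: keeps only the 'industryField' column of the grouped object, so each group now holds a single Series. |\n",
    "| 3. `.apply(lambda x: ...)` | `pandas.Series` | **Apply a function**: calls the lambda once per group, with `x` bound to that group's 'industryField' Series. Inside the lambda: 1) `x.str.contains('电商', na=False)` checks whether each string contains the substring 电商 (e-commerce) and returns a boolean Series, with `na=False` counting missing values as non-matches; 2) `.sum()` adds up that boolean Series, where `True` counts as 1 and `False` as 0, so the result is the number of rows in the group that mention 电商. |\n",
    "| 4. `.reset_index(name='count')` | `pandas.DataFrame` | **Reset the index and name the values**: the Series returned by `apply` is indexed by the group key (the district names) and holds the counts. `reset_index()` turns that index back into an ordinary column and `name='count'` labels the value column, giving a tidy two-column DataFrame. |\n",
    "\n",
    "Core logic and design philosophy\n",
    "\n",
    "This one-liner follows pandas' split-apply-combine strategy:\n",
    "1.  **Split**: `groupby` partitions the data by the key column.\n",
    "2.  **Apply**: `apply` runs an arbitrary function on each group (here, a string match followed by a sum).\n",
    "3.  **Combine**: pandas assembles the per-group results into one object, and `reset_index` then reshapes it into an easy-to-read table."
   ]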
  },
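  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The per-group count described above can also be computed without `apply`. The sketch below is an illustration only: the `toy` DataFrame and its values are made up (they are not the exercise data), and the names `via_apply`, `mask` and `via_mask` are hypothetical. It builds the boolean mask once and then sums it per district, which should give the same counts as the lambda version.\n",
    "\n",
    "```python\n",
    "import pandas as pd\n",
    "\n",
    "# Hypothetical toy data standing in for the job-postings DataFrame.\n",
    "toy = pd.DataFrame({\n",
    "    'district': ['余杭区', '余杭区', '西湖区', '西湖区', '滨江区'],\n",
    "    'industryField': ['电商', '移动互联网,电商', '金融', None, '电商,企业服务'],\n",
    "})\n",
    "\n",
    "# Same pattern as the exercise: one lambda call per group.\n",
    "via_apply = (\n",
    "    toy.groupby('district')['industryField']\n",
    "       .apply(lambda x: x.str.contains('电商', na=False).sum())\n",
    "       .reset_index(name='count')\n",
    ")\n",
    "\n",
    "# Vectorised variant: build the boolean mask once, then sum it per district.\n",
    "mask = toy['industryField'].str.contains('电商', na=False)\n",
    "via_mask = mask.groupby(toy['district']).sum().reset_index(name='count')\n",
    "\n",
    "print(via_apply)\n",
    "print(via_mask)\n",
    "```\n",
    "\n",
    "Both results have one row per district; the mask-based form avoids one Python-level lambda call per group, which may matter on larger data."
   ]
  },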
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 11 - Grouping rules | Via a built-in function\n",
    "\n",
    "Group by the length of `positionName` and compute the mean salary for each name length."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "application/vnd.microsoft.datawrangler.viewer.v0+json": {
       "columns": [
        {
         "name": "positionName",
         "rawType": "int64",
         "type": "integer"
        },
        {
         "name": "salary",
         "rawType": "float64",
         "type": "float"
        }
       ],
       "ref": "fad8b374-6ca2-425f-9211-e509a8a0baab",
       "rows": [
        [
         "4",
         "30125.0"
        ],
        [
         "5",
         "34083.33"
        ],
        [
         "6",
         "32954.55"
        ],
        [
         "7",
         "29816.67"
        ],
        [
         "8",
         "31875.0"
        ],
        [
         "9",
         "29375.0"
        ],
        [
         "10",
         "30000.0"
        ],
        [
         "11",
         "34166.67"
        ],
        [
         "12",
         "29583.33"
        ],
        [
         "13",
         "38833.33"
        ],
        [
         "14",
         "40000.0"
        ],
        [
         "15",
         "26000.0"
        ],
        [
         "16",
         "28750.0"
        ],
        [
         "17",
         "40000.0"
        ],
        [
         "18",
         "25750.0"
        ],
        [
         "19",
         "45000.0"
        ],
        [
         "21",
         "21500.0"
        ],
        [
         "23",
         "60000.0"
        ]
       ],
       "shape": {
        "columns": 1,
        "rows": 18
       }
      },
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>salary</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>positionName</th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>30125.00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>34083.33</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>6</th>\n",
       "      <td>32954.55</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>7</th>\n",
       "      <td>29816.67</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>8</th>\n",
       "      <td>31875.00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>9</th>\n",
       "      <td>29375.00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>10</th>\n",
       "      <td>30000.00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>11</th>\n",
       "      <td>34166.67</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>12</th>\n",
       "      <td>29583.33</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>13</th>\n",
       "      <td>38833.33</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>14</th>\n",
       "      <td>40000.00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>15</th>\n",
       "      <td>26000.00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>16</th>\n",
       "      <td>28750.00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>17</th>\n",
       "      <td>40000.00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>18</th>\n",
       "      <td>25750.00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>19</th>\n",
       "      <td>45000.00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>21</th>\n",
       "      <td>21500.00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>23</th>\n",
       "      <td>60000.00</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "                salary\n",
       "positionName          \n",
       "4             30125.00\n",
       "5             34083.33\n",
       "6             32954.55\n",
       "7             29816.67\n",
       "8             31875.00\n",
       "9             29375.00\n",
       "10            30000.00\n",
       "11            34166.67\n",
       "12            29583.33\n",
       "13            38833.33\n",
       "14            40000.00\n",
       "15            26000.00\n",
       "16            28750.00\n",
       "17            40000.00\n",
       "18            25750.00\n",
       "19            45000.00\n",
       "21            21500.00\n",
       "23            60000.00"
      ]
     },
     "execution_count": 14,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.groupby(df['positionName'].str.len()).agg({\n",
    "    'salary':'mean'\n",
    "}).round(2)"
   ]
  },
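  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The table below summarises the main kinds of grouping key that `groupby` accepts through its `by` parameter.\n",
    "\n",
    "| Grouping key | Description | Minimal example |\n",
    "| :--- | :--- | :--- |\n",
    "| **Column name (string)** | The most common form; groups by the distinct values of the named column. | `df.groupby('team')` |\n",
    "| **List of column names** | Groups hierarchically by several columns. | `df.groupby(['team', 'department'])` |\n",
    "| **pandas Series** | A Series of group labels, aligned with the DataFrame on its index; it can be derived, as in the exercise above. | `df.groupby(df['team'])` |\n",
    "| **NumPy array / Python list** | Externally supplied labels, one per row of the DataFrame. | `df.groupby([1,1,2,2,3])` |\n",
    "| **Function** | Called on each index label; the return value becomes the group label. | `df.groupby(lambda x: x % 2)` |\n",
    "| **Dictionary** | A mapping from index label to group name. | `df.groupby({0:'A', 1:'A', 2:'B'})` |\n",
    "| **`pd.Grouper` object** | Mainly used to group by a time frequency. | `df.groupby(pd.Grouper(key='date', freq='M'))` (`'ME'` in pandas 2.2 and later) |\n",
    "| **Index level (`level`)** | Passed through the separate `level` parameter rather than `by`; groups by one level of a MultiIndex. | `df.groupby(level=0)` |\n"
   ]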
  },
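  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To make the key types in the table concrete, here is a minimal sketch on a made-up four-row DataFrame. The `toy` frame, its column names and the `by_*` variable names are all hypothetical and chosen only for illustration.\n",
    "\n",
    "```python\n",
    "import numpy as np\n",
    "import pandas as pd\n",
    "\n",
    "# Hypothetical toy frame with the default integer index 0..3.\n",
    "toy = pd.DataFrame({\n",
    "    'team': ['a', 'a', 'b', 'b'],\n",
    "    'points': [1, 2, 3, 4],\n",
    "})\n",
    "\n",
    "# Column name: group by the values of 'team'.\n",
    "by_column = toy.groupby('team')['points'].sum()\n",
    "# Series: group by labels derived from a column.\n",
    "by_series = toy.groupby(toy['team'].str.upper())['points'].sum()\n",
    "# NumPy array: one external label per row.\n",
    "by_array = toy.groupby(np.array([0, 0, 0, 1]))['points'].sum()\n",
    "# Function: called on each index label.\n",
    "by_func = toy.groupby(lambda idx: idx % 2)['points'].sum()\n",
    "# Dictionary: maps index label to group name.\n",
    "by_dict = toy.groupby({0: 'x', 1: 'x', 2: 'y', 3: 'y'})['points'].sum()\n",
    "\n",
    "for name, result in [('column', by_column), ('series', by_series),\n",
    "                     ('array', by_array), ('function', by_func),\n",
    "                     ('dict', by_dict)]:\n",
    "    print(name, result.to_dict())\n",
    "```\n",
    "\n",
    "Note that the function and dictionary forms operate on index labels, not on column values, so they depend on how the DataFrame is indexed."
   ]
  },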
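  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 12 - Grouping rules | Via a dictionary\n",
    "\n",
    "Define the total score as the sum of `score` and `matchScore`, group by it together with the `salary` column, and inspect the result."
   ]
  },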
  {
   "cell_type": "code",
   "execution_count": 24,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "application/vnd.microsoft.datawrangler.viewer.v0+json": {
       "columns": [
        {
         "name": "index",
         "rawType": "int64",
         "type": "integer"
        },
        {
         "name": "total_score",
         "rawType": "float64",
         "type": "float"
        },
        {
         "name": "salary",
         "rawType": "int64",
         "type": "integer"
        },
        {
         "name": "0",
         "rawType": "int64",
         "type": "integer"
        }
       ],
       "ref": "54c7e688-69da-4b69-a79a-2635dcca0406",
       "rows": [
        [
         "0",
         "0.16",
         "21500",
         "1"
        ],
        [
         "1",
         "0.26",
         "30000",
         "1"
        ],
        [
         "2",
         "0.28",
         "20000",
         "1"
        ],
        [
         "3",
         "0.28",
         "35000",
         "1"
        ],
        [
         "4",
         "0.31",
         "25000",
         "1"
        ],
        [
         "5",
         "0.34",
         "36500",
         "1"
        ],
        [
         "6",
         "0.36",
         "30000",
         "1"
        ],
        [
         "7",
         "1.36",
         "37500",
         "1"
        ],
        [
         "8",
         "1.39",
         "42500",
         "1"
        ],
        [
         "9",
         "1.41",
         "30000",
         "1"
        ],
        [
         "10",
         "1.44",
         "30000",
         "1"
        ],
        [
         "11",
         "1.46",
         "30000",
         "1"
        ],
        [
         "12",
         "1.5",
         "29000",
         "1"
        ],
        [
         "13",
         "1.5",
         "30000",
         "1"
        ],
        [
         "14",
         "1.54",
         "50000",
         "1"
        ],
        [
         "15",
         "1.83",
         "30000",
         "1"
        ],
        [
         "16",
         "2.39",
         "30000",
         "1"
        ],
        [
         "17",
         "2.51",
         "40000",
         "1"
        ],
        [
         "18",
         "2.57",
         "30000",
         "1"
        ],
        [
         "19",
         "2.82",
         "7500",
         "1"
        ],
        [
         "20",
         "2.82",
         "30000",
         "1"
        ],
        [
         "21",
         "3.59",
         "60000",
         "1"
        ],
        [
         "22",
         "3.64",
         "22500",
         "1"
        ],
        [
         "23",
         "3.69",
         "7500",
         "1"
        ],
        [
         "24",
         "3.73",
         "30000",
         "1"
        ],
        [
         "25",
         "3.73",
         "37500",
         "1"
        ],
        [
         "26",
         "3.87",
         "30000",
         "1"
        ],
        [
         "27",
         "3.9",
         "26500",
         "1"
        ],
        [
         "28",
         "3.9",
         "37500",
         "1"
        ],
        [
         "29",
         "4.46",
         "30000",
         "1"
        ],
        [
         "30",
         "4.77",
         "30000",
         "1"
        ],
        [
         "31",
         "4.82",
         "6500",
         "1"
        ],
        [
         "32",
         "4.82",
         "24000",
         "1"
        ],
        [
         "33",
         "4.83",
         "22500",
         "1"
        ],
        [
         "34",
         "4.83",
         "30000",
         "3"
        ],
        [
         "35",
         "4.83",
         "45000",
         "1"
        ],
        [
         "36",
         "4.85",
         "45000",
         "1"
        ],
        [
         "37",
         "4.86",
         "25000",
         "1"
        ],
        [
         "38",
         "4.86",
         "27500",
         "1"
        ],
        [
         "39",
         "4.86",
         "30000",
         "1"
        ],
        [
         "40",
         "4.86",
         "40000",
         "1"
        ],
        [
         "41",
         "4.91",
         "25000",
         "1"
        ],
        [
         "42",
         "4.91",
         "30000",
         "2"
        ],
        [
         "43",
         "5.02",
         "27000",
         "1"
        ],
        [
         "44",
         "5.08",
         "50000",
         "1"
        ],
        [
         "45",
         "5.11",
         "30000",
         "1"
        ],
        [
         "46",
         "5.16",
         "37500",
         "1"
        ],
        [
         "47",
         "5.92",
         "20000",
         "1"
        ],
        [
         "48",
         "5.94",
         "22500",
         "1"
        ],
        [
         "49",
         "5.95",
         "27500",
         "2"
        ]
       ],
       "shape": {
        "columns": 3,
        "rows": 100
       }
      },
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>total_score</th>\n",
       "      <th>salary</th>\n",
       "      <th>0</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>0.16</td>\n",
       "      <td>21500</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>0.26</td>\n",
       "      <td>30000</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>0.28</td>\n",
       "      <td>20000</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>0.28</td>\n",
       "      <td>35000</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>0.31</td>\n",
       "      <td>25000</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>...</th>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>95</th>\n",
       "      <td>78.76</td>\n",
       "      <td>30000</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>96</th>\n",
       "      <td>80.87</td>\n",
       "      <td>45000</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>97</th>\n",
       "      <td>94.97</td>\n",
       "      <td>3500</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>98</th>\n",
       "      <td>208.56</td>\n",
       "      <td>15000</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>99</th>\n",
       "      <td>248.10</td>\n",
       "      <td>37500</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>100 rows × 3 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "    total_score  salary  0\n",
       "0          0.16   21500  1\n",
       "1          0.26   30000  1\n",
       "2          0.28   20000  1\n",
       "3          0.28   35000  1\n",
       "4          0.31   25000  1\n",
       "..          ...     ... ..\n",
       "95        78.76   30000  1\n",
       "96        80.87   45000  1\n",
       "97        94.97    3500  1\n",
       "98       208.56   15000  1\n",
       "99       248.10   37500  1\n",
       "\n",
       "[100 rows x 3 columns]"
      ]
     },
     "execution_count": 24,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df['total_score']=(df['score'].astype(float)+df['matchScore'].astype(float)).round(2)\n",
    "df.groupby(['total_score','salary']).size().reset_index()"
   ]
  },
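  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The exercise title mentions a dictionary, while the cell above adds a `total_score` column and groups by two columns. One possible dictionary-based reading is to group the *columns* themselves with a label-to-group mapping. The sketch below is an assumption about what such a variant could look like, not the notebook author's solution: the `toy` data, the `mapping` dictionary and the name `grouped` are hypothetical. It transposes first because a dictionary passed to `groupby` maps index labels, and grouping along `axis=1` is deprecated in recent pandas releases.\n",
    "\n",
    "```python\n",
    "import pandas as pd\n",
    "\n",
    "# Hypothetical toy data with the three columns named in the exercise.\n",
    "toy = pd.DataFrame({\n",
    "    'score': [4, 5, 1],\n",
    "    'matchScore': [0.8, 0.9, 0.4],\n",
    "    'salary': [30000, 45000, 20000],\n",
    "})\n",
    "\n",
    "# Map each column label to the group it should be summed into.\n",
    "mapping = {'score': 'total_score', 'matchScore': 'total_score', 'salary': 'salary'}\n",
    "\n",
    "# Transpose so the column labels become the index, group them with the\n",
    "# dictionary, sum within each group, then transpose back.\n",
    "grouped = toy.T.groupby(mapping).sum().T\n",
    "print(grouped)\n",
    "```\n",
    "\n",
    "The result keeps one row per original row and has one column per dictionary value (`salary` and `total_score`)."
   ]
  },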
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 13 - Grouping rules | Via multiple columns\n",
    "\n",
    "Compute the mean salary for each combination of work experience (`workYear`) and education level (`education`)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 27,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "application/vnd.microsoft.datawrangler.viewer.v0+json": {
       "columns": [
        {
         "name": "index",
         "rawType": "int64",
         "type": "integer"
        },
        {
         "name": "workYear",
         "rawType": "object",
         "type": "string"
        },
        {
         "name": "education",
         "rawType": "object",
         "type": "string"
        },
        {
         "name": "salary",
         "rawType": "float64",
         "type": "float"
        }
       ],
       "ref": "1f21bc36-5979-4deb-951f-5ee7cd56039f",
       "rows": [
        [
         "0",
         "1-3年",
         "不限",
         "36250.0"
        ],
        [
         "1",
         "1-3年",
         "本科",
         "31000.0"
        ],
        [
         "2",
         "1-3年",
         "硕士",
         "36875.0"
        ],
        [
         "3",
         "3-5年",
         "不限",
         "30312.5"
        ],
        [
         "4",
         "3-5年",
         "大专",
         "28125.0"
        ],
        [
         "5",
         "3-5年",
         "本科",
         "31828.12"
        ],
        [
         "6",
         "5-10年",
         "不限",
         "26250.0"
        ],
        [
         "7",
         "5-10年",
         "本科",
         "28423.08"
        ],
        [
         "8",
         "不限",
         "不限",
         "35000.0"
        ],
        [
         "9",
         "不限",
         "本科",
         "35625.0"
        ],
        [
         "10",
         "应届毕业生",
         "不限",
         "32500.0"
        ],
        [
         "11",
         "应届毕业生",
         "本科",
         "33833.33"
        ]
       ],
       "shape": {
        "columns": 3,
        "rows": 12
       }
      },
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>workYear</th>\n",
       "      <th>education</th>\n",
       "      <th>salary</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>1-3年</td>\n",
       "      <td>不限</td>\n",
       "      <td>36250.00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>1-3年</td>\n",
       "      <td>本科</td>\n",
       "      <td>31000.00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>1-3年</td>\n",
       "      <td>硕士</td>\n",
       "      <td>36875.00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>3-5年</td>\n",
       "      <td>不限</td>\n",
       "      <td>30312.50</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>3-5年</td>\n",
       "      <td>大专</td>\n",
       "      <td>28125.00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>3-5年</td>\n",
       "      <td>本科</td>\n",
       "      <td>31828.12</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>6</th>\n",
       "      <td>5-10年</td>\n",
       "      <td>不限</td>\n",
       "      <td>26250.00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>7</th>\n",
       "      <td>5-10年</td>\n",
       "      <td>本科</td>\n",
       "      <td>28423.08</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>8</th>\n",
       "      <td>不限</td>\n",
       "      <td>不限</td>\n",
       "      <td>35000.00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>9</th>\n",
       "      <td>不限</td>\n",
       "      <td>本科</td>\n",
       "      <td>35625.00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>10</th>\n",
       "      <td>应届毕业生</td>\n",
       "      <td>不限</td>\n",
       "      <td>32500.00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>11</th>\n",
       "      <td>应届毕业生</td>\n",
       "      <td>本科</td>\n",
       "      <td>33833.33</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   workYear education    salary\n",
       "0      1-3年        不限  36250.00\n",
       "1      1-3年        本科  31000.00\n",
       "2      1-3年        硕士  36875.00\n",
       "3      3-5年        不限  30312.50\n",
       "4      3-5年        大专  28125.00\n",
       "5      3-5年        本科  31828.12\n",
       "6     5-10年        不限  26250.00\n",
       "7     5-10年        本科  28423.08\n",
       "8        不限        不限  35000.00\n",
       "9        不限        本科  35625.00\n",
       "10    应届毕业生        不限  32500.00\n",
       "11    应届毕业生        本科  33833.33"
      ]
     },
     "execution_count": 27,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.groupby(['workYear','education']).agg({'salary':'mean'}).reset_index().round(2)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 14 - Group transformation | transform\n",
    "\n",
    "Add a column to the original DataFrame `df` whose value is the mean salary of the row's district."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 29,
   "metadata": {},
   "outputs": [],
   "source": [
    "df['district_avg_salary']=df.groupby('district')['salary'].transform('mean').round(2)"
   ]
  },
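  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The table below summarises the key facts about the `transform` method.\n",
    "\n",
    "| Item | Description |\n",
    "| :--- | :--- |\n",
    "| **Purpose** | Transforms the data while keeping the result the same shape (row count and index) as the input. |\n",
    "| **Applies to** | `DataFrame` and `Series`; most often used on the result of `groupby()`. |\n",
    "| **Common parameters** | `func` (the function to apply), `*args`, `**kwargs`. The non-grouped `DataFrame.transform` also accepts `axis`; the grouped `transform` does not. |\n",
    "| **Return value** | A `DataFrame` or `Series` with the same shape as the original object. |"
   ]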
  },
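  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The shape-preserving behaviour summarised above is easiest to see next to an ordinary aggregation. The following is a minimal sketch on made-up data: the `toy` frame and the names `per_group`, `per_row` and `diff_from_avg` are hypothetical and only serve the illustration.\n",
    "\n",
    "```python\n",
    "import pandas as pd\n",
    "\n",
    "# Hypothetical toy data: two districts, three postings.\n",
    "toy = pd.DataFrame({\n",
    "    'district': ['余杭区', '余杭区', '西湖区'],\n",
    "    'salary': [20000, 30000, 40000],\n",
    "})\n",
    "\n",
    "# Aggregation: one value per group, indexed by district.\n",
    "per_group = toy.groupby('district')['salary'].mean()\n",
    "\n",
    "# Transformation: one value per original row, aligned with toy's index.\n",
    "per_row = toy.groupby('district')['salary'].transform('mean')\n",
    "\n",
    "# Because the index matches, the result can be assigned straight back.\n",
    "toy['district_avg_salary'] = per_row\n",
    "toy['diff_from_avg'] = toy['salary'] - toy['district_avg_salary']\n",
    "\n",
    "print(per_group)\n",
    "print(toy)\n",
    "```\n",
    "\n",
    "Here `per_group` has two rows while `per_row` has three, which is why `transform` suits tasks such as comparing each row with its group average."
   ]
  },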
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 15 - Group filtering | filter\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Extract all rows belonging to districts whose average salary is below 30000."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 30,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "application/vnd.microsoft.datawrangler.viewer.v0+json": {
       "columns": [
        {
         "name": "index",
         "rawType": "int64",
         "type": "integer"
        },
        {
         "name": "positionName",
         "rawType": "object",
         "type": "string"
        },
        {
         "name": "companySize",
         "rawType": "object",
         "type": "string"
        },
        {
         "name": "industryField",
         "rawType": "object",
         "type": "string"
        },
        {
         "name": "financeStage",
         "rawType": "object",
         "type": "string"
        },
        {
         "name": "companyLabelList",
         "rawType": "object",
         "type": "string"
        },
        {
         "name": "firstType",
         "rawType": "object",
         "type": "string"
        },
        {
         "name": "secondType",
         "rawType": "object",
         "type": "string"
        },
        {
         "name": "thirdType",
         "rawType": "object",
         "type": "string"
        },
        {
         "name": "createTime",
         "rawType": "datetime64[ns]",
         "type": "datetime"
        },
        {
         "name": "district",
         "rawType": "object",
         "type": "string"
        },
        {
         "name": "salary",
         "rawType": "int64",
         "type": "integer"
        },
        {
         "name": "workYear",
         "rawType": "object",
         "type": "string"
        },
        {
         "name": "jobNature",
         "rawType": "object",
         "type": "string"
        },
        {
         "name": "education",
         "rawType": "object",
         "type": "string"
        },
        {
         "name": "positionAdvantage",
         "rawType": "object",
         "type": "string"
        },
        {
         "name": "imState",
         "rawType": "object",
         "type": "string"
        },
        {
         "name": "score",
         "rawType": "int64",
         "type": "integer"
        },
        {
         "name": "matchScore",
         "rawType": "float64",
         "type": "float"
        },
        {
         "name": "famousCompany",
         "rawType": "bool",
         "type": "boolean"
        },
        {
         "name": "total_score",
         "rawType": "float64",
         "type": "float"
        },
        {
         "name": "district_avg_salary",
         "rawType": "float64",
         "type": "float"
        }
       ],
       "ref": "6de7ff67-df38-4265-81ff-6fb27e907d0f",
       "rows": [
        [
         "2",
         "数据分析",
         "2000人以上",
         "移动互联网,企业服务",
         "上市公司",
         "['节日礼物', '年底双薪', '股票期权', '带薪年假']",
         "产品|需求|项目类",
         "数据分析",
         "数据分析",
         "2020-03-16 10:33:00",
         "江干区",
         "3500",
         "1-3年",
         "全职",
         "本科",
         "五险一金 周末双休 不加班 节日福利",
         "today",
         "80",
         "14.972357",
         "False",
         "94.97",
         "25250.0"
        ],
        [
         "3",
         "数据分析",
         "500-2000人",
         "电商",
         "D轮及以上",
         "['生日趴', '每月腐败基金', '每月补贴', '年度旅游']",
         "开发|测试|运维类",
         "数据开发",
         "数据分析",
         "2020-03-16 10:10:00",
         "江干区",
         "45000",
         "3-5年",
         "全职",
         "本科",
         "年终奖等",
         "threeDays",
         "68",
         "12.874153",
         "True",
         "80.87",
         "25250.0"
        ],
        [
         "45",
         "金融数据分析师",
         "500-2000人",
         "电商",
         "D轮及以上",
         "['生日趴', '每月腐败基金', '每月补贴', '年度旅游']",
         "开发|测试|运维类",
         "数据开发",
         "数据分析",
         "2020-03-16 10:36:00",
         "江干区",
         "22500",
         "3-5年",
         "全职",
         "本科",
         "平台大,机会多,重点项目",
         "today",
         "5",
         "0.99588984",
         "True",
         "6.0",
         "25250.0"
        ],
        [
         "54",
         "数据分析专家",
         "50-150人",
         "移动互联网,消费生活",
         "未融资",
         "['年底双薪', '专项奖金', '美女多', '弹性工作']",
         "产品|需求|项目类",
         "高端产品职位",
         "数据分析专家",
         "2020-03-16 09:38:00",
         "拱墅区",
         "30000",
         "5-10年",
         "全职",
         "本科",
         "领导NICE",
         "today",
         "5",
         "0.96269345",
         "False",
         "5.96",
         "28500.0"
        ],
        [
         "72",
         "BI数据分析师",
         "500-2000人",
         "移动互联网,金融",
         "B轮",
         "['弹性工作', '扁平管理', '领导好', '五险一金']",
         "产品|需求|项目类",
         "数据分析",
         "BI",
         "2020-03-16 09:46:00",
         "拱墅区",
         "24000",
         "5-10年",
         "全职",
         "本科",
         "带薪年假 / 五险一金 / 节假日福利",
         "threeDays",
         "4",
         "0.8210544",
         "False",
         "4.82",
         "28500.0"
        ],
        [
         "73",
         "数据分析师",
         "2000人以上",
         "消费生活,硬件",
         "上市公司",
         "['定期体检', '五险一金', '专项奖金', '骨干家庭公寓']",
         "产品|需求|项目类",
         "数据分析",
         "数据分析",
         "2020-03-16 08:07:00",
         "江干区",
         "30000",
         "3-5年",
         "全职",
         "本科",
         "大平台免费住宿，免费班车，出国游学等",
         "today",
         "4",
         "0.82769364",
         "True",
         "4.83",
         "25250.0"
        ],
        [
         "81",
         "商业数据分析师（阿里数据银行）",
         "50-150人",
         "移动互联网,广告营销",
         "天使轮",
         "['节日礼物', '带薪年假', '绩效奖金', '五险一金']",
         "市场|商务类",
         "市场|营销",
         "商业数据分析",
         "2020-03-16 09:09:00",
         "上城区",
         "22500",
         "1-3年",
         "全职",
         "本科",
         "五险一金 周末双休 福利丰厚 带薪年假",
         "today",
         "3",
         "0.6373689",
         "False",
         "3.64",
         "26250.0"
        ],
        [
         "89",
         "数据分析",
         "500-2000人",
         "其他",
         "未融资",
         "[]",
         "产品|需求|项目类",
         "数据分析",
         "数据分析",
         "2020-03-10 11:16:00",
         "拱墅区",
         "30000",
         "1-3年",
         "全职",
         "本科",
         "数据分析",
         "threeDays",
         "1",
         "1.8240006",
         "False",
         "2.82",
         "28500.0"
        ],
        [
         "96",
         "数据分析专员",
         "2000人以上",
         "移动互联网,广告营销",
         "上市公司",
         "['节日礼物', '股票期权', '带薪年假', '岗位晋升']",
         "产品|需求|项目类",
         "数据分析",
         "数据分析",
         "2020-03-14 15:10:00",
         "拱墅区",
         "30000",
         "1-3年",
         "全职",
         "不限",
         "股票期权,绩效奖金,弹性工作,五险一金",
         "threeDays",
         "1",
         "0.46032283",
         "False",
         "1.46",
         "28500.0"
        ],
        [
         "97",
         "旅游大数据分析师（杭州）",
         "50-150人",
         "数据服务,企业服务",
         "A轮",
         "['年底双薪', '股票期权', '午餐补助', '定期体检']",
         "开发|测试|运维类",
         "数据开发",
         "数据治理",
         "2020-03-12 16:38:00",
         "上城区",
         "30000",
         "1-3年",
         "全职",
         "本科",
         "管理扁平 潜力项目 五险一金 周末双休",
         "sevenDays",
         "1",
         "0.8267557",
         "False",
         "1.83",
         "26250.0"
        ]
       ],
       "shape": {
        "columns": 21,
        "rows": 10
       }
      },
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>positionName</th>\n",
       "      <th>companySize</th>\n",
       "      <th>industryField</th>\n",
       "      <th>financeStage</th>\n",
       "      <th>companyLabelList</th>\n",
       "      <th>firstType</th>\n",
       "      <th>secondType</th>\n",
       "      <th>thirdType</th>\n",
       "      <th>createTime</th>\n",
       "      <th>district</th>\n",
       "      <th>...</th>\n",
       "      <th>workYear</th>\n",
       "      <th>jobNature</th>\n",
       "      <th>education</th>\n",
       "      <th>positionAdvantage</th>\n",
       "      <th>imState</th>\n",
       "      <th>score</th>\n",
       "      <th>matchScore</th>\n",
       "      <th>famousCompany</th>\n",
       "      <th>total_score</th>\n",
       "      <th>district_avg_salary</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>数据分析</td>\n",
       "      <td>2000人以上</td>\n",
       "      <td>移动互联网,企业服务</td>\n",
       "      <td>上市公司</td>\n",
       "      <td>['节日礼物', '年底双薪', '股票期权', '带薪年假']</td>\n",
       "      <td>产品|需求|项目类</td>\n",
       "      <td>数据分析</td>\n",
       "      <td>数据分析</td>\n",
       "      <td>2020-03-16 10:33:00</td>\n",
       "      <td>江干区</td>\n",
       "      <td>...</td>\n",
       "      <td>1-3年</td>\n",
       "      <td>全职</td>\n",
       "      <td>本科</td>\n",
       "      <td>五险一金 周末双休 不加班 节日福利</td>\n",
       "      <td>today</td>\n",
       "      <td>80</td>\n",
       "      <td>14.972357</td>\n",
       "      <td>False</td>\n",
       "      <td>94.97</td>\n",
       "      <td>25250.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>数据分析</td>\n",
       "      <td>500-2000人</td>\n",
       "      <td>电商</td>\n",
       "      <td>D轮及以上</td>\n",
       "      <td>['生日趴', '每月腐败基金', '每月补贴', '年度旅游']</td>\n",
       "      <td>开发|测试|运维类</td>\n",
       "      <td>数据开发</td>\n",
       "      <td>数据分析</td>\n",
       "      <td>2020-03-16 10:10:00</td>\n",
       "      <td>江干区</td>\n",
       "      <td>...</td>\n",
       "      <td>3-5年</td>\n",
       "      <td>全职</td>\n",
       "      <td>本科</td>\n",
       "      <td>年终奖等</td>\n",
       "      <td>threeDays</td>\n",
       "      <td>68</td>\n",
       "      <td>12.874153</td>\n",
       "      <td>True</td>\n",
       "      <td>80.87</td>\n",
       "      <td>25250.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>45</th>\n",
       "      <td>金融数据分析师</td>\n",
       "      <td>500-2000人</td>\n",
       "      <td>电商</td>\n",
       "      <td>D轮及以上</td>\n",
       "      <td>['生日趴', '每月腐败基金', '每月补贴', '年度旅游']</td>\n",
       "      <td>开发|测试|运维类</td>\n",
       "      <td>数据开发</td>\n",
       "      <td>数据分析</td>\n",
       "      <td>2020-03-16 10:36:00</td>\n",
       "      <td>江干区</td>\n",
       "      <td>...</td>\n",
       "      <td>3-5年</td>\n",
       "      <td>全职</td>\n",
       "      <td>本科</td>\n",
       "      <td>平台大,机会多,重点项目</td>\n",
       "      <td>today</td>\n",
       "      <td>5</td>\n",
       "      <td>0.995890</td>\n",
       "      <td>True</td>\n",
       "      <td>6.00</td>\n",
       "      <td>25250.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>54</th>\n",
       "      <td>数据分析专家</td>\n",
       "      <td>50-150人</td>\n",
       "      <td>移动互联网,消费生活</td>\n",
       "      <td>未融资</td>\n",
       "      <td>['年底双薪', '专项奖金', '美女多', '弹性工作']</td>\n",
       "      <td>产品|需求|项目类</td>\n",
       "      <td>高端产品职位</td>\n",
       "      <td>数据分析专家</td>\n",
       "      <td>2020-03-16 09:38:00</td>\n",
       "      <td>拱墅区</td>\n",
       "      <td>...</td>\n",
       "      <td>5-10年</td>\n",
       "      <td>全职</td>\n",
       "      <td>本科</td>\n",
       "      <td>领导NICE</td>\n",
       "      <td>today</td>\n",
       "      <td>5</td>\n",
       "      <td>0.962693</td>\n",
       "      <td>False</td>\n",
       "      <td>5.96</td>\n",
       "      <td>28500.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>72</th>\n",
       "      <td>BI数据分析师</td>\n",
       "      <td>500-2000人</td>\n",
       "      <td>移动互联网,金融</td>\n",
       "      <td>B轮</td>\n",
       "      <td>['弹性工作', '扁平管理', '领导好', '五险一金']</td>\n",
       "      <td>产品|需求|项目类</td>\n",
       "      <td>数据分析</td>\n",
       "      <td>BI</td>\n",
       "      <td>2020-03-16 09:46:00</td>\n",
       "      <td>拱墅区</td>\n",
       "      <td>...</td>\n",
       "      <td>5-10年</td>\n",
       "      <td>全职</td>\n",
       "      <td>本科</td>\n",
       "      <td>带薪年假 / 五险一金 / 节假日福利</td>\n",
       "      <td>threeDays</td>\n",
       "      <td>4</td>\n",
       "      <td>0.821054</td>\n",
       "      <td>False</td>\n",
       "      <td>4.82</td>\n",
       "      <td>28500.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>73</th>\n",
       "      <td>数据分析师</td>\n",
       "      <td>2000人以上</td>\n",
       "      <td>消费生活,硬件</td>\n",
       "      <td>上市公司</td>\n",
       "      <td>['定期体检', '五险一金', '专项奖金', '骨干家庭公寓']</td>\n",
       "      <td>产品|需求|项目类</td>\n",
       "      <td>数据分析</td>\n",
       "      <td>数据分析</td>\n",
       "      <td>2020-03-16 08:07:00</td>\n",
       "      <td>江干区</td>\n",
       "      <td>...</td>\n",
       "      <td>3-5年</td>\n",
       "      <td>全职</td>\n",
       "      <td>本科</td>\n",
       "      <td>大平台免费住宿，免费班车，出国游学等</td>\n",
       "      <td>today</td>\n",
       "      <td>4</td>\n",
       "      <td>0.827694</td>\n",
       "      <td>True</td>\n",
       "      <td>4.83</td>\n",
       "      <td>25250.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>81</th>\n",
       "      <td>商业数据分析师（阿里数据银行）</td>\n",
       "      <td>50-150人</td>\n",
       "      <td>移动互联网,广告营销</td>\n",
       "      <td>天使轮</td>\n",
       "      <td>['节日礼物', '带薪年假', '绩效奖金', '五险一金']</td>\n",
       "      <td>市场|商务类</td>\n",
       "      <td>市场|营销</td>\n",
       "      <td>商业数据分析</td>\n",
       "      <td>2020-03-16 09:09:00</td>\n",
       "      <td>上城区</td>\n",
       "      <td>...</td>\n",
       "      <td>1-3年</td>\n",
       "      <td>全职</td>\n",
       "      <td>本科</td>\n",
       "      <td>五险一金 周末双休 福利丰厚 带薪年假</td>\n",
       "      <td>today</td>\n",
       "      <td>3</td>\n",
       "      <td>0.637369</td>\n",
       "      <td>False</td>\n",
       "      <td>3.64</td>\n",
       "      <td>26250.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>89</th>\n",
       "      <td>数据分析</td>\n",
       "      <td>500-2000人</td>\n",
       "      <td>其他</td>\n",
       "      <td>未融资</td>\n",
       "      <td>[]</td>\n",
       "      <td>产品|需求|项目类</td>\n",
       "      <td>数据分析</td>\n",
       "      <td>数据分析</td>\n",
       "      <td>2020-03-10 11:16:00</td>\n",
       "      <td>拱墅区</td>\n",
       "      <td>...</td>\n",
       "      <td>1-3年</td>\n",
       "      <td>全职</td>\n",
       "      <td>本科</td>\n",
       "      <td>数据分析</td>\n",
       "      <td>threeDays</td>\n",
       "      <td>1</td>\n",
       "      <td>1.824001</td>\n",
       "      <td>False</td>\n",
       "      <td>2.82</td>\n",
       "      <td>28500.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>96</th>\n",
       "      <td>数据分析专员</td>\n",
       "      <td>2000人以上</td>\n",
       "      <td>移动互联网,广告营销</td>\n",
       "      <td>上市公司</td>\n",
       "      <td>['节日礼物', '股票期权', '带薪年假', '岗位晋升']</td>\n",
       "      <td>产品|需求|项目类</td>\n",
       "      <td>数据分析</td>\n",
       "      <td>数据分析</td>\n",
       "      <td>2020-03-14 15:10:00</td>\n",
       "      <td>拱墅区</td>\n",
       "      <td>...</td>\n",
       "      <td>1-3年</td>\n",
       "      <td>全职</td>\n",
       "      <td>不限</td>\n",
       "      <td>股票期权,绩效奖金,弹性工作,五险一金</td>\n",
       "      <td>threeDays</td>\n",
       "      <td>1</td>\n",
       "      <td>0.460323</td>\n",
       "      <td>False</td>\n",
       "      <td>1.46</td>\n",
       "      <td>28500.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>97</th>\n",
       "      <td>旅游大数据分析师（杭州）</td>\n",
       "      <td>50-150人</td>\n",
       "      <td>数据服务,企业服务</td>\n",
       "      <td>A轮</td>\n",
       "      <td>['年底双薪', '股票期权', '午餐补助', '定期体检']</td>\n",
       "      <td>开发|测试|运维类</td>\n",
       "      <td>数据开发</td>\n",
       "      <td>数据治理</td>\n",
       "      <td>2020-03-12 16:38:00</td>\n",
       "      <td>上城区</td>\n",
       "      <td>...</td>\n",
       "      <td>1-3年</td>\n",
       "      <td>全职</td>\n",
       "      <td>本科</td>\n",
       "      <td>管理扁平 潜力项目 五险一金 周末双休</td>\n",
       "      <td>sevenDays</td>\n",
       "      <td>1</td>\n",
       "      <td>0.826756</td>\n",
       "      <td>False</td>\n",
       "      <td>1.83</td>\n",
       "      <td>26250.0</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>10 rows × 21 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "       positionName companySize industryField financeStage  \\\n",
       "2              数据分析     2000人以上    移动互联网,企业服务         上市公司   \n",
       "3              数据分析   500-2000人            电商        D轮及以上   \n",
       "45          金融数据分析师   500-2000人            电商        D轮及以上   \n",
       "54           数据分析专家     50-150人    移动互联网,消费生活          未融资   \n",
       "72          BI数据分析师   500-2000人      移动互联网,金融           B轮   \n",
       "73            数据分析师     2000人以上       消费生活,硬件         上市公司   \n",
       "81  商业数据分析师（阿里数据银行）     50-150人    移动互联网,广告营销          天使轮   \n",
       "89             数据分析   500-2000人            其他          未融资   \n",
       "96           数据分析专员     2000人以上    移动互联网,广告营销         上市公司   \n",
       "97     旅游大数据分析师（杭州）     50-150人     数据服务,企业服务           A轮   \n",
       "\n",
       "                      companyLabelList  firstType secondType thirdType  \\\n",
       "2     ['节日礼物', '年底双薪', '股票期权', '带薪年假']  产品|需求|项目类       数据分析      数据分析   \n",
       "3    ['生日趴', '每月腐败基金', '每月补贴', '年度旅游']  开发|测试|运维类       数据开发      数据分析   \n",
       "45   ['生日趴', '每月腐败基金', '每月补贴', '年度旅游']  开发|测试|运维类       数据开发      数据分析   \n",
       "54     ['年底双薪', '专项奖金', '美女多', '弹性工作']  产品|需求|项目类     高端产品职位    数据分析专家   \n",
       "72     ['弹性工作', '扁平管理', '领导好', '五险一金']  产品|需求|项目类       数据分析        BI   \n",
       "73  ['定期体检', '五险一金', '专项奖金', '骨干家庭公寓']  产品|需求|项目类       数据分析      数据分析   \n",
       "81    ['节日礼物', '带薪年假', '绩效奖金', '五险一金']     市场|商务类      市场|营销    商业数据分析   \n",
       "89                                  []  产品|需求|项目类       数据分析      数据分析   \n",
       "96    ['节日礼物', '股票期权', '带薪年假', '岗位晋升']  产品|需求|项目类       数据分析      数据分析   \n",
       "97    ['年底双薪', '股票期权', '午餐补助', '定期体检']  开发|测试|运维类       数据开发      数据治理   \n",
       "\n",
       "            createTime district  ...  workYear jobNature education  \\\n",
       "2  2020-03-16 10:33:00      江干区  ...      1-3年        全职        本科   \n",
       "3  2020-03-16 10:10:00      江干区  ...      3-5年        全职        本科   \n",
       "45 2020-03-16 10:36:00      江干区  ...      3-5年        全职        本科   \n",
       "54 2020-03-16 09:38:00      拱墅区  ...     5-10年        全职        本科   \n",
       "72 2020-03-16 09:46:00      拱墅区  ...     5-10年        全职        本科   \n",
       "73 2020-03-16 08:07:00      江干区  ...      3-5年        全职        本科   \n",
       "81 2020-03-16 09:09:00      上城区  ...      1-3年        全职        本科   \n",
       "89 2020-03-10 11:16:00      拱墅区  ...      1-3年        全职        本科   \n",
       "96 2020-03-14 15:10:00      拱墅区  ...      1-3年        全职        不限   \n",
       "97 2020-03-12 16:38:00      上城区  ...      1-3年        全职        本科   \n",
       "\n",
       "      positionAdvantage    imState score  matchScore  famousCompany  \\\n",
       "2    五险一金 周末双休 不加班 节日福利      today    80   14.972357          False   \n",
       "3                  年终奖等  threeDays    68   12.874153           True   \n",
       "45         平台大,机会多,重点项目      today     5    0.995890           True   \n",
       "54               领导NICE      today     5    0.962693          False   \n",
       "72  带薪年假 / 五险一金 / 节假日福利  threeDays     4    0.821054          False   \n",
       "73   大平台免费住宿，免费班车，出国游学等      today     4    0.827694           True   \n",
       "81  五险一金 周末双休 福利丰厚 带薪年假      today     3    0.637369          False   \n",
       "89                 数据分析  threeDays     1    1.824001          False   \n",
       "96  股票期权,绩效奖金,弹性工作,五险一金  threeDays     1    0.460323          False   \n",
       "97  管理扁平 潜力项目 五险一金 周末双休  sevenDays     1    0.826756          False   \n",
       "\n",
       "    total_score  district_avg_salary  \n",
       "2         94.97              25250.0  \n",
       "3         80.87              25250.0  \n",
       "45         6.00              25250.0  \n",
       "54         5.96              28500.0  \n",
       "72         4.82              28500.0  \n",
       "73         4.83              25250.0  \n",
       "81         3.64              26250.0  \n",
       "89         2.82              28500.0  \n",
       "96         1.46              28500.0  \n",
       "97         1.83              26250.0  \n",
       "\n",
       "[10 rows x 21 columns]"
      ]
     },
     "execution_count": 30,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df[df.groupby('district')['salary'].transform('mean') < 30000]"
   ]
  },
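  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To tell `transform`, `agg`, and `apply` apart after a pandas `groupby`, compare what each one takes as **input, how it processes it, and what it returns**. The table below summarizes the main differences.\n",
    "\n",
    "| Aspect | `transform` | `agg` (aggregate) | `apply` |\n",
    "| :--- | :--- | :--- | :--- |\n",
    "| **Core purpose** | Transform values within each group while keeping the original layout | Reduce each group to summary values | Run an arbitrary function on each group |\n",
    "| **Output shape** | ✅ **Same length and index as the input** | ❌ **Reduced summary** (one row per group) | ⚠️ **Depends on the function** (a scalar, Series, or DataFrame per group) |\n",
    "| **Typical use** | Attach a per-group statistic to every row (group mean, within-group rank, filling missing values with a group value) | Compute per-group statistics (sum, mean, max) | Logic the other two cannot express (sorting within groups, taking the top N rows per group) |\n",
    "| **Performance** | Usually fast with built-in function names such as `'mean'` | Usually the fastest with built-in function names | Usually the slowest, since the function is called once per group in Python |\n",
    "\n",
    "**Core mechanics in detail**\n",
    "\n",
    "These differences follow from how `groupby` works: conceptually, it splits a DataFrame by one or more keys into separate sub-DataFrames (the groups), and every later operation runs on those groups.\n",
    "\n",
    "- **`transform`: broadcasting**\n",
    "    Its goal is to **keep the shape of the original data**. It applies a function (such as `'mean'`) to each group. When the function reduces a group to a single value, that value is broadcast back to every row of the group; when it returns one value per row (for example `'rank'`), the values stay aligned with the original rows. Either way, the result has the same length and index as the input. That makes it a good fit for adding a per-group statistic as a new column, or for building a row-level filter such as `df[df.groupby('district')['salary'].transform('mean') < 30000]` above.\n",
    "\n",
    "- **`agg`: summarizing**\n",
    "    Its goal is to **aggregate each group and collapse the result**. Each function (such as `'sum'` or `'mean'`) turns a group's column into a single summary value, so the output has one row per distinct group. It is usually the most direct and efficient way to get standard statistics, for example the total sales of each department.\n",
    "\n",
    "- **`apply`: the general-purpose tool**\n",
    "    It is the **most flexible** of the three. It passes each whole group (a sub-DataFrame) to a custom function and places almost no restriction on what that function returns: a scalar, a Series, or even another DataFrame. That flexibility covers logic that `agg` and `transform` cannot express, but it usually comes with the highest performance cost."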
   ]
  },
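  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The output shapes of `transform`, `agg`, and `apply` described above can be checked on a toy table. The sketch below is only an illustration: the `toy` DataFrame, its values, and the variable names (`grouped`, `per_row_mean`, `per_group_mean`, `top2`) are assumptions made for this example and are not part of the exercise dataset.\n",
    "\n",
    "```python\n",
    "import pandas as pd\n",
    "\n",
    "# Hypothetical toy data; the column names only mirror the dataset above.\n",
    "toy = pd.DataFrame({\n",
    "    'district': ['A', 'A', 'B', 'B', 'B'],\n",
    "    'salary': [10, 30, 20, 40, 60],\n",
    "})\n",
    "grouped = toy.groupby('district')['salary']\n",
    "\n",
    "# transform: one value per original row, aligned to toy.index\n",
    "per_row_mean = grouped.transform('mean')\n",
    "\n",
    "# agg: one value per group, indexed by the group key\n",
    "per_group_mean = grouped.agg('mean')\n",
    "\n",
    "# apply: the function decides the shape; here, the 2 largest salaries per group\n",
    "top2 = grouped.apply(lambda s: s.nlargest(2))\n",
    "\n",
    "print(per_row_mean.tolist())     # [20.0, 20.0, 40.0, 40.0, 40.0]\n",
    "print(per_group_mean.to_dict())  # {'A': 20.0, 'B': 40.0}\n",
    "print(len(top2))                 # 4\n",
    "```\n",
    "\n",
    "Because `per_row_mean` shares the index of `toy`, it can be assigned directly as a new column or used as a boolean mask, which is what the `transform('mean') < 30000` filter above relies on."
   ]
  },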
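  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 16 - Grouped visualization\n",
    "\n",
    "<br>\n",
    "\n",
    "Group the data by Hangzhou district, count the companies in each district, and plot the counts as a bar chart."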
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 34,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAABKUAAAMWCAYAAAAgRDUeAAAAOnRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjEwLjcsIGh0dHBzOi8vbWF0cGxvdGxpYi5vcmcvTLEjVAAAAAlwSFlzAAAPYQAAD2EBqD+naQAAc51JREFUeJzs3XeYHWX9Pv57N1uSkJ4AoURDDUVCIDRRikIA6UUUAogiYAFESkQUFVR6k+aHKgqKhY5SpIkgIEoRQkIQkmCA0EIqSbZlz+8Pfjlfl9TFZM4mvF7XtRecmTlz3mf2vXM29z7zTFWpVCoFAAAAAApUXekCAAAAAPjoEUoBAAAAUDihFAAAAACFE0oBAAAAUDihFAAAAACFE0oBAAAAUDihFAAAAACFE0oBAAAAUDihFADwkdbc3FzpEgAAPpKEUgBAu51xxhn5y1/+Ms/yO++8M0ceeeQin//SSy/l8MMPz4wZMxa4zZgxY/L5z38+U6dOXei+mpqasv/+++fPf/7zIl93fvbZZ58cffTR7XrOuHHjMmbMmHZ9vfbaawvc36xZszJz5szy49tuuy0nnXTSQmtoaWlJQ0NDWltb21X7f2toaEiSvPfeeznvvPMyadKkNDU1lYO6f//73xk5cuSH3v+H8d/Hadq0abnkkkvS2NjY7v0cfPDBOf7445dkaQDAEiaUAgDa7e677864cePmWb7mmmvmF7/4Re67776FPn/KlCm55pprUlNTU1728MMP59VXXy0/njFjRm6++eZFhi51dXXp379/hg8fnokTJ7brfUyaNCn33ntvdt1113Y9b+edd87QoUPz6U9/epFfn/rUp7LFFlssNPi69tprs/baa+e9995L8n5Idf755+f1119f4HPuueeedOnSJZ06dUpVVdVife2///7l57e2tmabbbbJr3/968ycOTMjRozIpEmTct1112XYsGFJkt/85jfZaaedMmfOnIUej7XXXjsf//jHM2TIkAV+9ejRI6eddtpC9/P2229n0KBBueaaa5IkjY2NOemkk3Ldddct9HkfNHXq1Nx8883p2rVru54HABSrZtGbAAC0VV9fn06dOiVJ5syZk+bm5nTu3Dnrr79+fvSjH6W2tra8bVNTU2pqalJd/f/+FlZbW5uqqqp06dKlvOzII4/MiBEj8tWvfrW8TfJ+6DTX2LFj06lTp9TU1LQJtE444YT8+9//ztixY9u8Tmtra7m+Nddcc573cfbZZ6e5uTm77bbbfN/n5ptvnscff7z8Xufq3LlzTj755JxyyimLPliLUCqV8n//93/Zfffd061btyTJ/vvvnxEjRuSiiy7KOeecM9/nfeYzn8nLL7+curq6Nsd7QXbZZZd07ty5/Li6ujrnnntu9ttvv/zzn/9MktTU1ORnP/tZ+X09+eST2WeffeZ5/x/UpUuXfPKTn8zQoUMXuM0VV1zR5vs9PxdeeGFqamqy9957J0lWWmmlHH300fnBD36QffbZJ/369ZvnOWPHjk1LS0tqa2vL3/vrrrsuzc3N2WuvvfLKK6+02b5UKqWlpSUtLS1Zf/31F1oPALB0CaUAgMXS2tqaRx55JPX19ZkxY0bGjRuXRx55JNOnT8/uu+++0Oc+88wzGTJkSPlxVVVVm/AoeT/o+u8Aan6GDh2aadOmJck8QcmcOXPywAMPzLMsSXr27DnPZYAvvfRSLrvssgwdOjTrrrtufv7zn5fX3XXXXfnyl7+cq6++er6BzAeXvfHGG1lllVXm2e7SSy/NRhttlO22226B7+m6667Lv//97/zpT38qL6utrc0pp5yS73znOznhhBOy8sort3nO3NFjH/vYxxYaSDU3N2fOnDnp3Llzampq2tQ9evToPPvss/nWt76VO+64I0ny+9//Pptuumlef/31jB07Ng8//HAuv/zy8rErlUppbW1N375953mt7t27zzc0mmtR39tXX301l112Wb7z
ne+02f/3v//9/OY3v8mRRx6Zm2++OVVVVW2e97WvfW2e7/tcW2yxxQJf7+Mf//g8gRUAUCyhFACwWFpaWvK1r30t9fX1eeGFFzJu3LhcffXV+eMf/5ixY8emd+/e8wQGyfuXYPXp02eR+597idnCvPbaa6mpqUl9fX2bbf/1r39lk002yZNPPtkm/EreD6bmzp00V0NDQw444IAccMABOe2007L++uvna1/7WrbbbruMGzcuxxxzTH70ox9l8ODBi6z7mWeeyTbbbJNjjz02p512WnkE1+uvv54RI0Zk6NCh+fOf/5wVVlhhnudOmzYtJ510Ur797W9n4MCBbdYdccQRufjiizNixIh5Ll8bN25c1llnnUXWliQHHXRQfv3rX8+zfOLEibnrrrtSU1OTu+++O0ny6KOPpnPnznnooYdSVVWV9957LwcffHCb5+2www65//772yxrbm7O3Xffnccee2yBdUyYMGGhc0MdccQR6dOnT0444YQ2y3v27Jlrrrkmu+66a4477rhceOGFbb73f/zjH8uj56qrq3Pqqafm5z//ecaMGZO33347v/vd7/LDH/5wnhAUAOgASgAA7TB16tRSp06dStdee227n3vjjTeWDjvssNIzzzxT6tSpU+nRRx8tnX322aVSqVTaeOONS9dee23p9NNPL82cObP0zDPPlJKUZsyYscj9zt32mWeeWaw6DjnkkNLAgQNLU6ZMKZVKpdI555xTWm211UqPPfZYaeDAgaWDDjpooc/feOONSz/5yU9KpVKp1NLSUjr99NNLdXV1pU9+8pOlN954o9Ta2lrabbfdSgMGDChNnjx5gfv50pe+VFp11VUX+B4ffvjhUqdOnUpXXXVVm+VNTU2ll19+ufTaa6+V3nnnnQV+vfbaa6U333yzVCqVSkOHDi0deuihbfbT0tJSOuyww0qf+MQnSklK5557bvm1DjvssNKnP/3p8rZHHnlk6Ygjjig1NjaWSqVSac6cOaXp06eXWlpaFnqsPqipqak0c+bMNssuvvjiUnV1dekvf/nLAp93xRVXlKqrq0v77rvvAo/pCy+8UKqvry9df/31pVKpVLrrrrtK1dXV7aoPACiOkVIAQLvcd999mTNnTsaPH5/nn38+n/jEJ5K8P6Jl+vTpbbZ94YUXst5665UfP/7445kwYUL58dSpU3PWWWeV75JWU1OT3/72t+nbt2+23HLL+b7+o48+mk9/+tPzXbfJJpvMs+z555/Phhtu2GbZD37wg0yfPj29evVKkpx44on54x//mK233jo77rhjfvWrXy3iKPw/nTp1yve+973svvvu2WeffbL55ptnzz33zF133ZX7778/vXv3nu/zrrzyylx33XW58847y3NJHXvssTnggAPyyU9+MkmyzTbb5Ec/+lG+9rWvpVu3bjnggAOSvH9531prrbXYNc7Piy++mCOPPDKzZs3KrbfemnXWWSdrr712DjvssNTU1OTmm2/Oxz72sfL2b7zxRjbeeOPyZXivvvpq1l133dTW1qZbt27l97Aw7733XmbMmJEDDzwwV199dZL37zR43HHH5Wtf+1r69OmTF154Yb4j5rbddtucf/75+d73vpcNNtggv/zlL7PzzjuX10+fPj377LNPtt9++2y//fZ57bXXMnXq1NTU1Mxz58MePXqkR48eH+q4AQBLjlAKAGiXyy+/PMn7d4z74x//mCeffDLV1dWpq6vLbbfdlu222y4zZ87M6quvPs/dz8aPH1+eXLqqqirDhg1LdXV1HnroofI2xx13XM4///z85je/me/rz52s+4477ihfwjZmzJjss88+ufXWW8sh2D/+8Y8ceuihbSb3nuu/L3175plncvrpp+fJJ5/MrrvumnvvvTeHHXZYvvrVr2abbbZZ5CWFcw0ePDj//Oc/s8suu+TnP/95dt9993z2s5+d77Y333xzjjrqqBx77LHlO/+98MILufjiizNgwIByKJW8P6fSCy+8
kOHDh+eZZ57J2WefXV53xBFHlMOd+TnllFPyk5/8ZL7rHnvssay77rq5+OKL88gjj+SLX/xitthii9x222255ZZb0rlz54wZMyYNDQ3p3Llz3njjjfIE5Mn7czLNvRzvoosuSqlUWuQxWn/99dsESW+//XYOP/zwDB8+PKuuumo23njjhT7/61//eh577LEce+yx2XTTTcvLZ82alX322SdjxozJmDFjMmDAgDbP++DjU089NT/60Y8WWS8AsHS5uB4AWGyPP/54/vGPf2TNNdfMSSedlHfeeSdXXnllef2KK66YXr16pWfPnkkyzzw+Tz/9dDbccMO899576dGjR2pra7PTTjvlxhtvLG9z4IEH5tVXX51n3qK55k7WvcYaa2S99dbLeuutV56PaeDAgeVlc0f51NfXz7OPiRMn5rLLLssnP/nJbL311unTp0/GjBmTO++8M3//+98zc+bMfPazn02/fv2y2267ZcSIEXniiScWeXzeeOONvPTSS1l11VXzpz/9Kaeeeuo825RKpfz617/OLrvskvPOO6+8/Gc/+1nWWWedHHvssW22r66uzvXXX5/DDjusTSiUJH379s1mm22WKVOmzPO16qqrzjeQm+tzn/tcTjzxxLz99tv5yle+klVXXTWtra1Zc80106tXr5x77rlZZZVV8thjj6W1tTUvvvjiPCPO5jrppJMyZcqUDBw4MAMHDsz06dPz05/+tPx44MCBufnmm9t8n5P37673z3/+M7/4xS/yne98J01NTRkzZkyS5O9//3tKpVL5a80118yAAQMyZMiQ/PWvf82KK66YJJk0aVJ22mmnjBo1KnvuuWf23HPPNDc3p7m5OX/+859TX19fftzc3Jwtt9xyvj0BABTPSCkAYLHMmTMnxx13XL75zW/m/vvvzworrJBzzjknG2ywQZL3JzRf2CVckydPzsSJE7PRRhvlP//5TzlU+M53vpNOnTqVQ58uXbrk17/+9XwnBv9fvfjii9l7770zZsyYrL/++vnsZz+bSy65JP369Utra2teeeWV9O3bN+edd16+853v5P7778/TTz+da665Jt/+9rcXuu+XXnopn/vc5/KZz3wmv/3tbzNixIicdtppmTx5ci6++OLydlVVVfnd736XOXPmlCdFf+211/LLX/4yN9xwQ/luenMvkVx77bXTqVOn+Y6ImntHvbmXIf63qqqq+d45cK7zzz8/F110UaqqqtLS0pIrr7wyV155ZWbPnp2vf/3rOfXUU/PII4/k1ltvTd++fdPS0tJmdNJ/q6ury1VXXVUeGTd79uxMnTo1J554Ynmbt956a76h1hprrNHm8ahRo1JdXV3uq7kmTpyY1Vdfvc2yGTNmZIsttsisWbNy//3355e//GVefvnl8nGdG4rOfTzXBx8DAJXhExkAWCx/+9vfMm7cuNxzzz3lUUwHHnhgkvdDiPfeey+rrbbaAp/fp0+fzJo1K9XV1XnwwQfLd8n74N3ykmTPPffMv/71ryX9FjJo0KB897vfzZZbbpkVVlghH/vYx3LZZZctcPtPfvKT5ZFCC7t727333ptDDjkkW2+9dX7/+9+ntrY2F198cbp27Zqzzz47q6yySk4++eTy9h8cqXPaaadlk002yX777ddm2fXXX59Ro0bNcxnk4lrYJXXnnntuzj333Oy3335Zc801c+6552bGjBlZbbXV8sUvfjFJcvDBB5dHHm2//fblwOyDGhoa8sUvfjGDBg1K8n5A9/vf/z5f//rXy9v85je/SUtLyyJrvu2227LZZpule/fu5WXvvvtuGhoa5rkMr3v37rnqqqsyYMCArLvuuovcNwDQsQilAIDFst122+Wpp56a76ic5557Lv3790/fvn0Xuo+5I1RuvfXWHH744R+qjsWZuyh5f56hJPOMFjr00EOTvD9yJ0lGjhxZnqz9v5166qn561//mmTeyxDneu+99/LVr341N954Y4499ticf/75bbY966yz
8sYbb2TcuHELrPOZZ57JtddemwcffLDN8m9/+9u5+uqrc9ppp7WZR+q/lUqlPPHEEwuc96qpqWmBr5sk//rXv3Lrrbfm0EMPzR133JGHH344gwcPzrbbbpvk/cnF11577VxxxRW54447Frif66+/vs33ZerUqenUqVObkU0nnXRS+TLLhdXzu9/9Lpdeemmb5a+//nqSeeeGSpIddtih/P+tra0L3f/ibgMAFEMoBQAstrmhwAeDoauvvjp77bXXPNuXSqW89NJLee6558qjgEaOHJkxY8aUR1m116JG20yYMCHbb7993nrrrfTv3z/9+/ef73YLGvXz3xZ2+VuSdOvWLQceeGD23Xff8uiiD7riiisW+FozZszIl7/85ey///7Zdttt09TUlDfffDOvvfZaXnvttWy22Wa54IILcsghh8w3OJs5c2Y222yz3HffffOs+8EPftDmErj5hTEbb7xxnnrqqdx333054YQT8vLLL2ejjTbKTTfdlM9//vOZPXt2eV6qhoaG+b6Hyy67LKeffnq6dOlSDsdmz56dadOm5ZRTTilv19zcnKamprzxxhvz3c+zzz6b3XbbLUOHDs1Xv/rVNutefPHFJJnn8r0Pmjvx+vzccsstefzxx/Pcc8/lqKOOWuh+AIBiCKUAgHZramoqj8L5xS9+kdtvvz3PPvtsef3cEOb222/PxIkTc/PNN5dDqRNOOCGHHXbYPCOuWlpaMmfOnPLjuSHKB8OU7t27Z7/99itPpv7f25RKpXzsYx/LsGHD0rNnzxx55JGLvHvexIkT5zsX1tSpUxc4KqulpSUNDQ2ZOnVqtt9++/L2CzJ79uw0NTWlb9++5dFi06dPz6c+9ak8//zzmTp1avr165fJkyenVCqlc+fOWXPNNbPWWmtl9dVXz9FHH93mDoVzbbHFFhk4cOB8R69dcsklmTZtWu644468/fbbefHFF7P55pu32aaqqiobbLBBHnzwwbz99tu5+uqrM2XKlNxxxx3ZcMMNM3z48PTo0SMnnXRShg8fntmzZ+dLX/pSm30ccsghOfDAA9OrV6/yKLH7778/X/7yl/Pyyy+Xt5szZ05mz549T53vvfderrjiivz4xz/OwIEDc/vtt5fDwDvuuCOPPvpobrzxxmy88cYLnbg9mTeUam1tLX8PZ86cmZ///OfZeeed55kwHgCoDKEUANBuc0Op+++/PyeeeGL+9Kc/ZZVVVimvr6+vz6GHHprzzz8/3bp1K9+F7qKLLsro0aNz0003zbPPhoaGNqNx5gYYs2fPTo8ePcrL11tvvXmePzeUmhtqXXHFFYv1HpJk5513XuA2n/rUp+a7vKGhIaeffnpOP/30Rb7Of3vhhRey3nrrJUl69OiR1VZbLd27d89mm22WDTbYIOuss07WWWedDBgwoBym3Xnnndl9991z3333ZdiwYW32N3z48IW+Xrdu3cojoNZdd90cccQR5XWjRo3Kz3/+8/zhD3/IxhtvnMceeywbbrhhZs2ale9973sZMmRIDjrooPzf//1f6uvr09TUlEMPPTT/+Mc/ypfX7brrrvnHP/6Rnj17tgn/mpubM3v27Ky99tpt6pkzZ06mTZuWM844ozzf1GWXXZbvf//7Oeqoo3L66ae3CQh79+6dX/ziF9l6661z1llnLfL4fnAy+Dlz5pRH1g0fPjwHH3zwIkNKAKA4VaXFnZgBAGA+Jk+enD59+iz2tjNmzMjHP/7xJVrD3/72t2yzzTb529/+tsAg6YPmzJmTV199NautttpiXcr3v2htbU1TU1Pq6urazDlVKpUWKyS54447sscee3yoQGX06NHp2bPnPJPQz5o1K9/97nczfPjwbLXVVm3WXXrppfnEJz5RHgU2180335zNNttsiX//Xn755XkCrCXhtttuyz777JPm5mZ33AOADkgoBQAAAEDhFnxvYwAAAABYSoRSAAAAABROKAUAAABA4YRSAAAAABRuuboNSWtrayZO
nJju3bu73S8AAABABZRKpcyYMSOrrrpqmzsPf9ByFUpNnDgxAwYMqHQZAAAAAB95r776alZfffUFrl+uQqnu3bsnef9N9+jRo8LVdCzNzc259957s9NOO6W2trbS5bAM0DO0l56hvfQM7aVnaC89Q3vpGdpLz8zf9OnTM2DAgHJOsyDLVSg195K9Hj16CKU+oLm5OV27dk2PHj38oLBY9AztpWdoLz1De+kZ2kvP0F56hvbSMwu3qKmVTHQOAAAAQOGEUgAAAAAUTigFAAAAQOGWqzmlAAAAgOVHa2trmpqaKl3GAjU3N6empiYNDQ2ZM2dOpcspTG1tbTp16vQ/70coBQAAAHQ4TU1NGT9+fFpbWytdygKVSqX0798/r7766iIn9V7e9OrVK/379/+f3rdQCgAAAOhQSqVS3njjjXTq1CkDBgxIdXXHnH2otbU17733Xrp169Zha1zSSqVSZs2albfffjtJssoqq3zofQmlAAAAgA6lpaUls2bNyqqrrpquXbtWupwFmnt5YefOnT8yoVSSdOnSJUny9ttvZ6WVVvrQl/J9dI4YAAAAsEyYOz9TXV1dhSthQeaGhc3NzR96H0IpAAAAoEP6qM3TtCxZEt8boRQAAADAUtbc3Jx//vOf813X2NiYxsbGlEqlgquqLKEUAAAAwFLwhz/8IaecckqS5JVXXsmnP/3pvPnmm/Ns96Mf/Sh9+/ZNv379FvpVXV2du+66q/y8c845J5MnT86Pf/zjfPvb387rr7+en/70p0mSffbZJ3ffffcCa9t+++0zZMiQbL/99vN8bbXVVvnYxz62hI/GvEx0DgAAACwTznpmUqGv991N+v1Pz7/wwgtz8MEHJ0nWWWed7LTTTrn00kvLwdFcZ511Vs4666xF7q9Xr17p3Llz+XF1dXUOP/zwbLbZZqmrq8vVV1+dqqqqzJkzJw888MBC91lfX585c+akpaVlnnVz5swpT2a+NAmlAAAAAJawm266KW+88UaOOOKI8rKf/OQn+cxnPpNjjjkmK6+8cpL3L+ubMWPGAu/gN2fOnMyZMyc9evRIkvI2kydPzsc+9rG0tLTkxRdfzHvvvZdVVlkl66+/fh566KH07NkzgwYNarOP/544fs6cOfnGN76RTTfddJ7XfOONN3LiiScuuYOxAEIpAAAAgCVo2rRpOe6443LRRRe1CYKGDBmS/fffP9/4xjdyyy23JEkee+yx7LDDDqmrq5tvKNXa2ppPfvKTeeCBB+Z5jbvuuiuvvfZaHnjggQwZMiR9+vTJE088kWeffTYzZszIwIEDM3PmzMyePTvf+c538sMf/rD8/IMOOij/+c9/cs8998z3PRx11FFL4lAslFAKAAAAYAkplUo59NBDs8kmm2TfffdNY2NjRo0aVR6RdMEFF2To0KE57rjjcuGFF2a77bab7yV0i7LGGmvkxBNPzP77759dd901ffr0ySuvvJKbbropm2yySc4777wcfvjh+b//+7+MGjWqHEgdc8wxefTRR9OrV6907dp1gftvaWnJ7373u2y33XY599xzP9zBWAShFAAAAMASUCqV8u1vfzv//Oc/8+yzzyZJ7rjjjnz1q1/NK6+8kj59+qRbt265+eabs/322+fNN9/MVVddlW7dumXQoEFpbGycZ58bbLBBm8nN57r//vtz5JFH5vrrr8/jjz+eSZMmZeONN85RRx2V8ePHZ+zYsUmSiRMnZtVVVy0/75JLLkmSXHvttZk2bdoC30tVVVWOPfbY/+l4LIq77wEAAAAsAW+99Vb++te/5s4770y/fu9Pkn7RRRfllFNOSZ8+fcrbfeITn8iDDz6YDTbYICussEKSZOrUqbntttvyyiuvlL9++tOf5t13353va22//fb5xz/+kQEDBuTyyy/P97///Zx44olZc801853vfCePPvpokmTcuHFZd91153n+WWedlTlz5mTgwIEZOHBgfvjDH6Z///4ZOHBg+vbtm+9+97tL
+vDMw0gpAAAAgCWgf//+efrpp8tzQz3wwAN5/fXXyyOO3nnnnfzjH//IbrvtlsGDB2fw4MHl59bW1s53n1VVVfNdfuONN+aEE07I9OnTU19fn4022iiTJ0/OwQcfnMsuuywDBw7MW2+9lUcffXS+l99VV1fnl7/8ZTkUmz17di644IJUV1enpaVlvvNbLWlCKQAAAIAlZG6Y09ramhEjRuTMM89MfX19kuSll17KQQcdlGeffTYf//jH53nurrvu2mZi9JkzZ2bgwIHzfZ0DDzwwn//85/OJT3wid955Z9Zee+0MGTIkhxxySKqqqnLggQdm+PDh6dWrV5vL9+baaqut0rt37/K8Us8991w+85nPpLa2Ni0tLeW7Ay5NQikAAACAJeyCCy5Ily5dcsABB5SXrbvuutlll13y9a9/PXfffXeb7d9777089NBDGTJkSHnZn//85/z6178uPy6VSm2eM3r06Ky44ooZPnx4Vl999ZRKpWyxxRZJkq9//es5++yz87Of/Wye2v7+97/n2WefbROANTc35y9/+Us5VGtqasqjjz6aT33qUx/6GCyKUAoAAABgCbr88stz0kknZciQIdl6663z5ptv5s0330yfPn2y1lpr5V//+lduueWW7LvvvuXnfO1rX8tKK63UZj8777xzdt5559x6660ZN25cpk+fns6dO5fXb7zxxvnb3/6Wiy++OD/72c+y8cYbZ4899sivf/3rfP3rX8/222+fc845JzvvvHPWW2+9Ns/785//nH79+pUvD+zXr18efPDBdOvWLaVSKbNmzWoTWi0NQikAAABgmfDdTfpVuoTFsvLKK2fvvffOZpttlvXXXz/rrLNO1lxzzXTp0iVJcu655+bCCy9sE0qdffbZC9zfX/7ylzz88MP5/ve/n8033zxJMmnSpPzmN7/J9ddfn3XXXTd///vf069fv1x00UUZPHhwjjrqqJx88sk588wzs+WWW+aqq67KF77whdx111354Q9/mJ49e7aZr6qqqip77rlnm7mkpk+fniOOOCJHHHHEkj5ESYRSAAAAAEvUPvvsk3322WeB67/1rW/lG9/4xmLv7+KLL55nWa9evTJz5sxcf/31WX/99cvLu3Xrlj//+c/ZcMMNkyQnn3xyNt544/Jlfbvuumt23XXXxX7tpUkoBQAAAFCg+vr68uTnH1ZNTU2+973vzbN8fqOaOkoI9UFL//5+AAAAAPABQikAAAAACieUAgAAADqkUqlU6RJYgCXxvRFKAQAAAB1Kp06dkiRNTU0VroQFmTVrVpKktrb2Q+/DROcAAABAh1JTU5OuXbvmnXfeSW1tbaqrO+aYmtbW1jQ1NaWhoaHD1riklUqlzJo1K2+//XZ69epVDhA/DKEUAAAA0KFUVVVllVVWyfjx4/Of//yn0uUsUKlUyuzZs9OlS5dUVVVVupxC9erVK/379/+f9iGUAliGnfXMpEqXUFbd2pJBSS587t20Vlf+4+W7m/SrdAkAAPwP6urqss4663ToS/iam5vz8MMPZ9ttt/2fLmNb1tTW1v5PI6Tmqvy/GgAAAADmo7q6Op07d650GQvUqVOntLS0pHPnzh+pUGpJ+Whc8AgAAABAhyKUAgAAAKBwQikAAAAACieUAgAAAKBwQikAAAAACieUAgAAAKBwQikAAAAACieUAgAAAKBwQikAAAAACieUAgAAAKBwFQ2lpk6dmieeeCJTpkypZBkAAAAAFKxiodSNN96YgQMH5vDDD8/qq6+eG2+8MUnyrW99K1VVVeWvtddeu1IlAgAAALCUVCSUmjZtWr75zW/m4YcfzsiRI3PZZZdlxIgRSZInn3wyd955Z6ZMmZIpU6bkmWeeqUSJAAAAACxFNZV40enTp+dnP/tZBg8enCTZdNNN8+6776alpSWjRo3Ktttum27dulWiNAAAAAAKUJGRUgMGDMhBBx2UJGlubs6FF16YffbZJyNHjkxra2uGDBmSLl26ZJdd
dsmECRMqUSIAAAAAS1FFRkrN9eyzz+azn/1s6urq8sILL+TOO+/MoEGDcskll6Rfv3457rjjcuSRR+aee+6Z7/MbGxvT2NhYfjx9+vQk7wddzc3NhbyHZcXc4+G4sLj0zLKhurWl0iWUza2lo9Skdzs+5xnaS8/QXnqG9tIztJeemb/FPR5VpVKptJRrWaBSqZSnn346xx13XFZaaaXcdNNNbdZPmDAha6yxRqZMmZIePXrM8/xTTz01p5122jzLb7jhhnTt2nWp1Q0AAADA/M2aNSvDhw/PtGnT5pvnzFXRUGqu8ePHZ6211srkyZPTq1ev8vKGhoZ06dIlY8aMyaBBg+Z53vxGSg0YMCCTJk1a6Jv+KGpubs59992XYcOGpba2ttLlsAzQM8uGC597t9IllFW3tmSdiU/lpVWHprW6ogNxkyTHDe5b6RJYBOcZ2kvP0F56hvbSM7SXnpm/6dOnp1+/fosMpSryr4a//vWv+dOf/pRzzz03SVJXV5eqqqqcdtpp2XzzzTN8+PAkyeOPP57q6uoMGDBgvvupr69PfX39PMtra2s1wwI4NrSXnunYOkL480Gt1TUdoi59u+xwnqG99AztpWdoLz1De+mZthb3WFTkXw3rrrturrzyyqyzzjr53Oc+l1NOOSU77bRThg4dmlNOOSUrr7xy5syZk2OOOSZf+tKXXIoHAAAAsJypSCi1yiqr5Kabbsq3v/3tnHjiidl5551z3XXXZcUVV8yoUaOy3377pVOnTjn44INzxhlnVKJEAAAAAJaiil1fMWzYsIwaNWqe5WeeeWbOPPPMClQEAAAAQFGqK10AAAAAAB89QikAAAAACieUAgAAAKBwQikAAAAACieUAgAAAKBwQikAAAAACieUAgAAAKBwQikAAAAACieUAgAAAKBwQikAAAAACieUAgAAAKBwQikAAAAACieUAgAAAKBwQikAAAAACieUAgAAAKBwQikAAAAACieUAgAAAKBwQikAAAAACieUAgAAAKBwQikAAAAACieUAgAAAKBwQikAAAAACieUAgAAAKBwQikAAAAACieUAgAAAKBwQikAAAAACieUAgAAAKBwQikAAAAACieUAgAAAKBwQikAAAAACldT6QIAAAAAFtdZz0yqdAll1a0tGZTkwufeTWt15SOW727Sr9IltIuRUgAAAAAUTigFAAAAQOGEUgAAAAAUTigFAAAAQOGEUgAAAAAUTigFAAAAQOGEUgAAAAAUTigFAAAAQOGEUgAAAAAUTigFAAAAQOGEUgAAAAAUTigFAAAAQOGEUgAAAAAUTigFAAAAQOGEUgAAAAAUTigFAAAAQOGEUgAAAAAUTigFAAAAQOGEUgAAAAAUTigFAAAAQOGEUgAAAAAUTigFAAAAQOGEUgAAAAAUTigFAAAAQOGEUgAAAAAUTigFAAAAQOGEUgAAAAAUTigFAAAAQOGEUgAAAAAUTigFAAAAQOGEUgAAAAAUTigFAAAAQOGEUgAAAAAUTigFAAAAQOGEUgAAAAAUTigFAAAAQOGEUgAAAAAUTigFAAAAQOGEUgAAAAAUTigFAAAAQOGEUgAAAAAUTigFAAAAQOGEUgAAAAAUTigFAAAAQOGEUgAAAAAUTigFAAAAQOGEUgAAAAAUrqKh1NSpU/PEE09kypQplSwDAAAAgIJVLJS68cYbM3DgwBx++OFZffXVc+ONNyZJnn/++Wy++ebp3bt3RowYkVKpVKkSAQAAAFhKKhJKTZs2Ld/85jfz8MMPZ+TIkbnssssyYsSINDY2Zo899sjQoUPz5JNPZvTo0fnlL39ZiRIBAAAAWIoqEkpNnz49P/vZzzJ48OAkyaabbpp33303d999d6ZNm5YLLrgga621Vs4444xcc801lSgRAAAAgKWophIvOmDAgBx00EFJkubm5lx44YXZZ5998uyzz2arrbZK165dkySDBw/O6NGjF7ifxsbGNDY2lh9Pnz69vM/m5ual+A6W
PXOPh+PC4tIzy4bq1pZKl1A2t5aOUpPe7ficZ2gvPUN76RnaS88sGzrK75uJ34EXZHHrqCpVcNKmZ599Np/97GdTV1eXF154IT/5yU/S0NCQyy67rLzNiiuumH//+9/p3bv3PM8/9dRTc9ppp82z/IYbbigHWwAAAAAUZ9asWRk+fHimTZuWHj16LHC7ioZSpVIpTz/9dI477ristNJKWWuttdLc3JwLLrigvM2AAQPy97//Pauttto8z5/fSKkBAwZk0qRJC33TH0XNzc257777MmzYsNTW1la6HJYBembZcOFz71a6hLLq1pasM/GpvLTq0LRWV2QgbhvHDe5b6RJYBOcZ2kvP0F56hvbSM8sGvwMvWEf5HXj69Onp16/fIkOpih6xqqqqDB06NL/61a+y1lpr5cwzz8zzzz/fZpsZM2akrq5uvs+vr69PfX39PMtra2udQBbAsaG99EzH1hE++D6otbqmQ9Slb5cdzjO0l56hvfQM7aVnOraO8LvmB/kduK3FraMiE53/9a9/zYgRI8qP6+rqUlVVlfXXXz+PP/54efn48ePT2NiYPn36VKJMAAAAAJaSioRS6667bq688spceeWVefXVV/O9730vO+20U3bddddMnz491157bZLkjDPOyI477phOnTpVokwAAAAAlpKKhFKrrLJKbrrpplx00UXZcMMNM2vWrFx33XWpqanJ1VdfnaOPPjr9+vXL7bffnrPPPrsSJQIAAACwFFXsgsdhw4Zl1KhR8yzfc889M3bs2Dz11FPZaqut0rdvx5ikCwAAAIAlp/KzcM1H//79s9tuu1W6DAAAAACWkopcvgcAAADAR5tQCgAAAIDCCaUAAAAAKJxQCgAAAIDCCaUAAAAAKJxQCgAAAIDCCaUAAAAAKJxQCgAAAIDCCaUAAAAAKJxQCgAAAIDCCaUAAAAAKJxQCgAAAIDCCaUAAAAAKJxQCgAAAIDCCaUAAAAAKJxQCgAAAIDCCaUAAAAAKJxQCgAAAIDCCaUAAAAAKJxQCgAAAIDCCaUAAAAAKJxQCgAAAIDCCaUAAAAAKJxQCgAAAIDCCaUAAAAAKJxQCgAAAIDCCaUAAAAAKJxQCgAAAIDCCaUAAAAAKJxQCgAAAIDCCaUAAAAAKJxQCgAAAIDCCaUAAAAAKJxQCgAAAIDCCaUAAAAAKJxQCgAAAIDCCaUAAAAAKJxQCgAAAIDCCaUAAAAAKJxQCgAAAIDCCaUAAAAAKJxQCgAAAIDCCaUAAAAAKJxQCgAAAIDCCaUAAAAAKJxQCgAAAIDCCaUAAAAAKJxQCgAAAIDCCaUAAAAAKJxQCgAAAIDCCaUAAAAAKJxQCgAAAIDCCaUAAAAAKJxQCgAAAIDCCaUAAAAAKJxQCgAAAIDCCaUAAAAAKJxQCgAAAIDCCaUAAAAAKJxQCgAAAIDCCaUAAAAAKJxQCgAAAIDCCaUAAAAAKJxQCgAAAIDCCaUAAAAAKJxQCgAAAIDCCaUAAAAAKJxQCgAAAIDCCaUAAAAAKJxQCgAAAIDCCaUAAAAAKJxQCgAAAIDCCaUAAAAAKJxQCgAAAIDCCaUAAAAAKJxQCgAAAIDCCaUAAAAAKFzFQqnbb789a665ZmpqajJkyJC88MILSZJvfetbqaqqKn+tvfbalSoRAAAAgKWkIqHU2LFj85WvfCVnnXVWXn/99ay77ro5/PDDkyRPPvlk7rzzzkyZMiVTpkzJM888U4kSAQAAAFiKairxoi+88ELOOuusfOELX0iSfOMb38huu+2WlpaWjBo1Kttuu226detWidIAAAAAKEBFRkrtvvvuOfLII8uPX3zxxayzzjoZOXJkWltbM2TIkHTp0iW77LJLJkyYUIkSAQAAAFiKKjJS6r81NTXl/PPPz/HHH5/Ro0dn0KBBueSSS9KvX78cd9xxOfLII3PPPffM97mNjY1pbGwsP54+fXqSpLm5Oc3NzYXUv6yYezwcFxaXnlk2VLe2VLqEsrm1
dJSa9G7H5zxDe+kZ2kvP0F56ZtnQUX7fTPwOvCCLW0dVqVQqLeVaFurkk0/O3XffnX/+85+pra1ts27ChAlZY401MmXKlPTo0WOe55566qk57bTT5ll+ww03pGvXrkutZgAAAADmb9asWRk+fHimTZs23zxnroqGUg8++GD23nvv/P3vf88GG2wwz/qGhoZ06dIlY8aMyaBBg+ZZP7+RUgMGDMikSZMW+qY/ipqbm3Pfffdl2LBh84R/MD96Ztlw4XPvVrqEsurWlqwz8am8tOrQtFZXfCBujhvct9IlsAjOM7SXnqG99AztpWeWDX4HXrCO8jvw9OnT069fv0WGUhU7YuPHj8+BBx6Yyy67rBxIjRgxIptsskmGDx+eJHn88cdTXV2dAQMGzHcf9fX1qa+vn2d5bW2tE8gCODa0l57p2DrCB98HtVbXdIi69O2yw3mG9tIztJeeob30TMfWEX7X/CC/A7e1uHVU5IjNnj07u+++e/baa6/ss88+ee+995IkgwcPzimnnJKVV145c+bMyTHHHJMvfelLLsUDAAAAWM5UJJS69957M3r06IwePTpXXXVVefn48ePzxS9+Mfvtt186deqUgw8+OGeccUYlSgQAAABgKapIKLXXXntlQVNZnXnmmTnzzDMLrggAAACAIlVXugAAAAAAPnqEUgAAAAAUTigFAAAAQOGEUgAAAAAUTigFAAAAQOGEUgAAAAAUTigFAAAAQOGEUgAAAAAUTigFAAAAQOGEUgAAAAAUTigFAAAAQOGEUgAAAAAUTigFAAAAQOGEUgAAAAAUTigFAAAAQOGEUgAAAAAUTigFAAAAQOGEUgAAAAAUTigFAAAAQOGEUgAAAAAUTigFAAAAQOGEUgAAAAAUTigFAAAAQOGEUgAAAAAUTigFAAAAQOGEUgAAAAAUTigFAAAAQOGEUgAAAAAUTigFAAAAQOGEUgAAAAAUTigFAAAAQOGEUgAAAAAUTigFAAAAQOGEUgAAAAAUTigFAAAAQOGEUgAAAAAUTigFAAAAQOGEUgAAAAAUTigFAAAAQOGEUgAAAAAUTigFAAAAQOGEUgAAAAAUTigFAAAAQOGEUgAAAAAUTigFAAAAQOGEUgAAAAAUTigFAAAAQOGEUgAAAAAUTigFAAAAQOGEUgAAAAAUTigFAAAAQOGEUgAAAAAUTigFAAAAQOGEUgAAAAAUTigFAAAAQOGEUgAAAAAUTigFAAAAQOGEUgAAAAAUTigFAAAAQOGEUgAAAAAUTigFAAAAQOGEUgAAAAAUTigFAAAAQOGEUgAAAAAUTigFAAAAQOGEUgAAAAAUrqbSBQAAAMuPs56ZVOkSyqpbWzIoyYXPvZvW6sr/0+e7m/SrdAkAHYqRUgAAAAAUTigFAAAAQOGEUgAAAAAUTigFAAAAQOGEUgAAAAAUTigFAAAAQOGEUgAAAAAUTigFAAAAQOGEUgAAAAAUTigFAAAAQOGEUgAAAAAUTigFAAAAQOE+VCh10kknLXDdNddc86GLAQAAAOCj4UOFUn/961/nWfb444+npaUlV1111WLt4/bbb8+aa66ZmpqaDBkyJC+88EKS5Pnnn8/mm2+e3r17Z8SIESmVSh+mRAAAAAA6sMUKpQYOHJj1118/66yzTn72s5+ld+/eueSSS9K1a9d07949e+yxR4455pg0NTWle/fui9zf2LFj85WvfCVnnXVWXn/99ay77ro5/PDD09jYmD322CNDhw7Nk08+mdGjR+eXv/zl//oeAQAAAOhgFiuUWmWVVdKjR4+ce+65mTZtWmpra1MqlXLllVdmvfXWyyc+8YnU1dWla9eui/WiL7zwQs4666x84QtfyMorr5xvfOMbeeaZZ3L33Xdn2rRpueCCC7LWWmvljDPOcDkgAAAAwHKoZnE26ty5c+bMmZNVV101zz77bHl59+7dU1NTk5qaxdpN2e67797m8Ysvvph11lknzz77bLbaaqtyuDV48OCMHj16
gftpbGxMY2Nj+fH06dOTJM3NzWlubm5XTcu7ucfDcWFx6ZllQ3VrS6VLKJtbS0epSe92fM4ztJeeWTZ0lM+BxGcT7ec8s2zoKD/TifPMgixuHe1Lkz6gsbExpVIpLS0taW1tzaxZs9o9B1RTU1POP//8HH/88Xn55ZezxhprlNdVVVWlU6dOmTJlSnr37j3Pc88888ycdtpp8yy/9957F3vU1kfNfffdV+kSWMbomY5tUKULmI91Jj5V6RKSJHe9VukKWFzOM7SXnunYfDYtmM+mZYfzTMfmPLNgHeU8M2vWrMXa7kOHUqVSKUcccUQaGhoycuTItLS05OMf/3iqq9s3d/qPfvSjrLDCCjn88MNzyimnpL6+vs36zp07Z9asWfMNpU4++eQcf/zx5cfTp0/PgAEDstNOO6VHjx4f7o0tp5qbm3Pfffdl2LBhqa2trXQ5LAP0zLLhwuferXQJZdWtLVln4lN5adWhaa3+n/7msUQcN7hvpUtgEZxnaC89s2zw2bRgPps6PueZZYPzzIJ1lPPM3CvZFuVDH7Gqqqpcd911Oeuss7LjjjvmgQceyGOPPZZhw4Yt9j4efPDBXHbZZfn73/+e2tra9OnTJ88//3ybbWbMmJG6urr5Pr++vn6eECtJamtrnUAWwLGhvfRMx9YRPvg+qLW6pkPUpW+XHc4ztJee6dg6wmfAB/lsor2cZzq2jvDz/EHOM20tbh3tG9Y0H1VVVR/qeePHj8+BBx6Yyy67LBtssEGSZPPNN8/jjz/eZpvGxsb06dPnfy0TAAAAgA5ksUKpqVOnZtq0aXn66afbLB83blxmzpyZ2bNnt+tFZ8+end133z177bVX9tlnn7z33nt57733ss0222T69Om59tprkyRnnHFGdtxxx3Tq1Kld+wcAAACgY1usUKpXr15ZZZVVcsstt+TjH/94Zs2alR49euSKK65Ic3NzJk+enKampkyaNGmxXvTee+/N6NGjc9VVV6V79+7lr9dffz1XX311jj766PTr1y+33357zj777P/pDQIAAADQ8SzWBY9/+ctf2jy+5JJL8uUvfzlf/vKXy8ueeOKJdOvWLVOnTl3k/vbaa68F3qVv4MCBGTt2bJ566qlstdVW6du3Y0zSBQAAAMCS86Fm4frUpz41z7Itt9wySfLjH//4f6soSf/+/bPbbrv9z/sBAAAAoGP6UBOdX3zxxQtc97nPfe5DFwMAAADAR0O7Qqnm5ub89Kc/Xeg2N910UyZPnvw/FQUAAADA8q1dl+916tQpZ599dt5+++2suuqqWXfddbP11lunf//+SZIXX3wxRxxxRH77299ml112WSoFAwAAALDsa9dIqerq6nTp0iUbbLBBZs2alVtuuSWf/OQns+mmm+aSSy7JjjvumJNOOkkgBQAAAMBCLdZIqT//+c9Za621svbaa6dbt275+te/Xl739ttvZ/jw4Tn22GPzmc98Jt/97neXWrEAAAAALB8Wa6TUFVdckaFDh+bjH/94pk6dml/84hf5/ve/n+233z5bbrlltthii4wfPz51dXU555xzlnbNAAAAACzjFmuk1C233JLW1tY89NBDufHGG3PmmWfmlVdeyXe+85089NBD5e1uuOGGDB48OHvssUfWX3/9pVUzAAAAAMu4xRopdfrpp+eMM87ISy+9lMbGxjzzzDMZPnx46urqcvTRR+cf//hHSqVSDjnkkJxwwgl57LHHlnbdAAAAACzDFmuk1Fe+8pVcffXVeeGFF1JXV5f9998/w4YNy/HHH5+ePXvmscceS6lUypprrplvf/vbS7lkAAAAAJZ1izVS6rzzzsusWbMyceLE/POf/8yKK66YSZMmZebMmenbt2+efvrprLnmmnn22WczadKkpV0zAAAAAMu4xQqlevfunV69eqVbt2555plnMmXKlPTt2zdDhgzJjBkzMm7cuLz22ms59thjjZQCAAAAYJEWK5Q6+uij889//jODBg3KDjvs
kLFjx2brrbfOAw88kIaGhuy6666pr6/PMccckxdffDEvvvji0q4bAAAAgGXYYoVSV199dbbeeuusueaaWX311XPjjTfmkEMOyQorrJAePXrk+eefz6WXXpokOeSQQ3Lrrbcu1aIBAAAAWLYt1kTnI0aMSJJMmTIlm2++eQYOHJgrrrgivXr1yvnnn5+ampoMHjw4SXLQQQelV69eS61gAAAAAJZ9ixVKzdW7d+/07t07SbLDDjskSQ444IA22/Tt23cJlQYAAADA8mqxQql77rknnTt3TnX1wq/269y5czbffPNUVVUtkeIAAAAAWD4tVii12267pUePHgvdplQqZcaMGTnllFNy2mmnLZHiAAAAAFg+LdZE53369MmUKVMW+jV16tScc845+eMf/7i0awYAAABgGbdYodTcy/FmzJiRYcOGLXC7tdZaK9tss82SqQwAAACA5Va7Jjrv1KlTHn/88STvX9I3efLkVFVVZc6cOVlvvfXyq1/9KnvvvffSqBMAAACA5chijZSaq1OnTuXJzkePHp2jjz463/zmNzN+/PgcfPDBS6VAAAAAAJY/7QqlPuiggw7KwQcfnG7dui30sj4AAAAA+G//UygFAAAAAB+GUAoAAACAwi3WROelUmm+y0eOHJnW1tY0NTVl5MiRWXHFFdO/f/8lWiAAAAAAy59FjpRqampKQ0NDkqS1tTWtra1Jkrq6ugwbNiy77757qqqqst122+Wiiy5autUCAAAAsFxYrJFSRx99dJKkoaEhjY2NSZIXX3xx6VUFAAAAwHJtkSOl6urqcuaZZyZJunfvngceeCBJMmvWrKVbGQAAAADLrXZNdF5TU5Ntt902SXLEEUfk9NNPXypFAQAAALB8+1B33zv++ONz7733Zq+99mqz/Lbbbst2221nFBUAAAAAC9WuUOqdd97Jfvvtl7vuuisPPfRQPvGJTyRJXn311ey66675yle+kj322CO1tbVLpVgAAAAAlg+LFUq9/fbb+cEPfpD11lsvK664Yp566qlsuOGGSZK77rorgwcPTs+ePTNkyJCceOKJQikAAAAAFmqRd9978skn8+lPfzo777xzHnzwwWy88cZt1m+xxRa56667su6662aTTTZZaoUCAAAAsPxYZCg1ZMiQPP/881l77bXbLJ86dWp69eqVfv36pV+/fkmSd999N42Njamvr1861QIAAACwXFjk5Xs1NTXzBFLPPPNMBg4cmPPPPz/Nzc3l5f3798+4ceOWfJUAAAAALFcWa06pH/zgB/m///u/tLS0JEk23HDDXHXVVbntttuy3nrr5cYbb0zyfij10ksvLb1qAQAAAFguLFYoNXjw4Fx//fVZd911c/vtt6euri77779/HnnkkVx55ZU566yzss0226SpqSkvvPDC0q4ZAAAAgGXcYoVS+++/fx577LGcc845OeaYY7LPPvtkypQpSZIddtghTz75ZPbaa68888wzefLJJ5dqwQAAAAAs+xYrlJrr85//fEaOHJlSqZTBgwdn5MiRSZKqqqqceOKJueyyy3LQQQctlUIBAAAAWH4s8u57H9SzZ8/cdtttOfbYY3PooYfm6aefLq/72te+tkSLAwAAAGD51O5Qaq6LLroob7311pKsBQAAAICPiHZdvvdBK6+88pKqAwAAAICPkP8plAIAAACAD0MoBQAAAEDhhFIAAAAAFE4oBQAAAEDhhFIAAAAAFE4oBQAAAEDhhFIAAAAAFE4oBQAAAEDhhFIAAAAAFE4oBQAAAEDhhFIAAAAAFE4oBQAAAEDhhFIAAAAAFE4oBQAAAEDhhFIAAAAAFE4oBQAAAEDhhFIAAAAAFE4oBQAAAEDhhFIAAAAAFE4oBQAAAEDhhFIAAAAAFE4oBQAAAEDhhFIAAAAAFE4oBQAAAEDhhFIAAAAAFE4oBQAAAEDhhFIAAAAAFE4oBQAAAEDhhFIAAAAAFE4oBQAAAEDhhFIAAAAAFE4oBQAAAEDhhFIA
AAAAFE4oBQAAAEDhhFIAAAAAFK6iodSkSZOyxhpr5JVXXikv+9a3vpWqqqry19prr125AgEAAABYKmoq9cKTJk3K7rvv3iaQSpInn3wyd955Z7beeuskSadOnSpQHQAAAABLU8VGSh1wwAEZPnx4m2UtLS0ZNWpUtt122/Tq1Su9evVK9+7dK1QhAAAAAEtLxUKpq666Kt/61rfaLBs5cmRaW1szZMiQdOnSJbvssksmTJhQoQoBAAAAWFoqdvneGmusMc+y0aNHZ9CgQbnkkkvSr1+/HHfccTnyyCNzzz33zHcfjY2NaWxsLD+ePn16kqS5uTnNzc1Lp/Bl1Nzj4biwuPTMsqG6taXSJZTNraWj1KR3Oz7nGdpLzywbOsrnQOKzifZznlk2dJSf6cR5ZkEWt46qUqlUWsq1LLyAqqqMHz8+AwcOnGfdhAkTssYaa2TKlCnp0aPHPOtPPfXUnHbaafMsv+GGG9K1a9elUS4AAAAACzFr1qwMHz4806ZNm2+eM1eHDqUaGhrSpUuXjBkzJoMGDZpn/fxGSg0YMCCTJk1a6Jv+KGpubs59992XYcOGpba2ttLlsAzQM8uGC597t9IllFW3tmSdiU/lpVWHprW6YgNxy44b3LfSJbAIzjO0l55ZNvhsWjCfTR2f88yywXlmwTrKeWb69Onp16/fIkOpyh+x/zJixIhssskm5QnQH3/88VRXV2fAgAHz3b6+vj719fXzLK+trXUCWQDHhvbSMx1bR/jg+6DW6poOUZe+XXY4z9BeeqZj6wifAR/ks4n2cp7p2DrCz/MHOc+0tbh1VP6I/ZeNN944p5xySlZeeeXMmTMnxxxzTL70pS+5FA8AAABgOdOhQqmDDz44o0aNyn777ZdOnTrl4IMPzhlnnFHpsgAAAABYwioeSn1wSqszzzwzZ555ZoWqAQAAAKAI1ZUuAAAAAICPHqEUAAAAAIUTSgEAAABQOKEUAAAAAIUTSgEAAABQOKEUAAAAAIUTSgEAAABQOKEUAAAAAIUTSgEAAABQOKEUAAAAAIUTSgEAAABQOKEUAAAAAIUTSgEAAABQOKEUAAAAAIUTSgEAAABQOKEUAAAAAIUTSgEAAABQOKEUAAAAAIUTSgEAAABQOKEUAAAAAIUTSgEAAABQOKEUAAAAAIUTSgEAAABQOKEUAAAAAIUTSgEAAABQOKEUAAAAAIUTSgEAAABQOKEUAAAAAIUTSgEAAABQOKEUAAAAAIUTSgEAAABQOKEUAAAAAIUTSgEAAABQOKEUAAAAAIUTSgEAAABQOKEUAAAAAIUTSgEAAABQOKEUAAAAAIUTSgEAAABQOKEUAAAAAIUTSgEAAABQOKEUAAAAAIUTSgEAAABQOKEUAAAAAIUTSgEAAABQOKEUAAAAAIUTSgEAAABQOKEUAAAAAIUTSgEAAABQOKEUAAAAAIUTSgEAAABQOKEUAAAAAIUTSgEAAABQOKEUAAAAAIUTSgEAAABQOKEUAAAAAIUTSgEAAABQOKEUAAAAAIUTSgEAAABQOKEUAAAAAIUTSgEAAABQOKEUAAAAAIUTSgEAAABQOKEUAAAAAIUTSgEAAABQOKEUAAAAAIUTSgEAAABQOKEUAAAAAIUTSgEAAABQOKEUAAAAAIUTSgEAAABQOKEUAAAAAIUTSgEAAABQOKEUAAAAAIUTSgEAAABQOKEUAAAAAIUTSgEAAABQOKEUAAAAAIUTSgEAAABQuIqGUpMmTcoaa6yRV155pbzs+eefz+abb57evXtnxIgRKZVKlSsQAAAAgKWiYqHUpEmTsvvuu7cJpBobG7PHHntk6NChefLJJzN69Oj88pe/rFSJAAAAACwlFQulDjjggAwfPrzNsrvvvjvTpk3LBRdckLXWWitnnHFGrrnmmgpVCAAAAMDSUrFQ6qqrrsq3vvWtNsueffbZbLXVVunatWuSZPDgwRk9enQlygMAAABgKaqp1AuvscYa8yybPn16m+VVVVXp
1KlTpkyZkt69e8+zfWNjYxobG9s8P0mam5vT3Ny8FKpeds09Ho4Li0vPLBuqW1sqXULZ3Fo6Sk16t+NznqG99MyyoaN8DiQ+m2g/55llQ0f5mU6cZxZkceuoKlV4JvGqqqqMHz8+AwcOzEknnZTm5uZccMEF5fUDBgzI3//+96y22mrzPPfUU0/NaaedNs/yG264oTzaCgAAAIDizJo1K8OHD8+0adPSo0ePBW5XsZFS89OnT588//zzbZbNmDEjdXV1893+5JNPzvHHH19+PH369AwYMCA77bTTQt/0R1Fzc3Puu+++DBs2LLW1tZUuh2WAnlk2XPjcu5Uuoay6tSXrTHwqL606NK3Vlf94OW5w30qXwCI4z9BeembZ4LNpwXw2dXzOM8sG55kF6yjnmblXsi1K5Y/Yf9l8881z1VVXlR+PHz8+jY2N6dOnz3y3r6+vT319/TzLa2trnUAWwLGhvfRMx9YRPvg+qLW6pkPUpW+XHc4ztJee6dg6wmfAB/lsor2cZzq2jvDz/EHOM20tbh0Vm+h8frbddttMnz491157bZLkjDPOyI477phOnTpVuDIAAAAAlqTKx3j/paamJldffXUOPPDAjBgxItXV1XnooYcqXRYAAAAAS1jFQ6kPzrO+5557ZuzYsXnqqaey1VZbpW/fjnE9JAAAAABLTsVDqfnp379/dtttt0qXAQAAAMBS0qHmlAIAAADgo0EoBQAAAEDhhFIAAAAAFE4oBQAAAEDhhFIAAAAAFE4oBQAAAEDhhFIAAAAAFE4oBQAAAEDhhFIAAAAAFE4oBQAAAEDhhFIAAAAAFE4oBQAAAEDhhFIAAAAAFE4oBQAAAEDhhFIAAAAAFE4oBQAAAEDhhFIAAAAAFE4oBQAAAEDhhFIAAAAAFE4oBQAAAEDhhFIAAAAAFE4oBQAAAEDhhFIAAAAAFE4oBQAAAEDhhFIAAAAAFE4oBQAAAEDhhFIAAAAAFE4oBQAAAEDhhFIAAAAAFE4oBQAAAEDhhFIAAAAAFE4oBQAAAEDhhFIAAAAAFE4oBQAAAEDhhFIAAAAAFE4oBQAAAEDhhFIAAAAAFE4oBQAAAEDhhFIAAAAAFE4oBQAAAEDhhFIAAAAAFE4oBQAAAEDhhFIAAAAAFE4oBQAAAEDhhFIAAAAAFE4oBQAAAEDhhFIAAAAAFE4oBQAAAEDhhFIAAAAAFE4oBQAAAEDhhFIAAAAAFE4oBQAAAEDhhFIAAAAAFE4oBQAAAEDhhFIAAAAAFE4oBQAAAEDhhFIAAAAAFE4oBQAAAEDhhFIAAAAAFE4oBQAAAEDhhFIAAAAAFE4oBQAAAEDhhFIAAAAAFE4oBQAAAEDhhFIAAAAAFE4oBQAAAEDhhFIAAAAAFE4oBQAAAEDhhFIAAAAAFE4oBQAAAEDhhFIAAAAAFE4oBQAAAEDhhFIAAAAAFE4oBQAAAEDhhFIAAAAAFE4oBQAAAEDhhFIAAAAAFK5DhlLf+ta3UlVVVf5ae+21K10SAAAAAEtQTaULmJ8nn3wyd955Z7beeuskSadOnSpcEQAAAABLUocLpVpaWjJq1Khsu+226datW6XLAQAAAGAp6HCX740cOTKtra0ZMmRIunTpkl122SUTJkyodFkAAAAALEEdbqTU6NGjM2jQoFxyySXp169fjjvuuBx55JG555575tm2sbExjY2N5cfTp09PkjQ3N6e5ubmwmpcFc4+H48Li0jPLhurWlkqXUDa3lo5Sk97t+JxnaC89s2zoKJ8Dic8m2s95ZtnQUX6mE+eZBVncOqpKpVJpKdfyP5kwYULWWGONTJkyJT169Giz7tRTT81pp502z3NuuOGGdO3atagSAQAAAPj/zZo1K8OHD8+0adPmyXL+W4cPpRoaGtKlS5eMGTMmgwYNarNufiOlBgwYkEmTJi30TX8UNTc357777suwYcNSW1tb6XJYBuiZZcOFz71b6RLKqltbss7Ep/LSqkPTWl35gbjHDe5b
6RJYBOcZ2kvPLBt8Ni2Yz6aOz3lm2eA8s2Ad5Twzffr09OvXb5GhVOWP2AeMGDEim2yySYYPH54kefzxx1NdXZ0BAwbMs219fX3q6+vnWV5bW+sEsgCODe2lZzq2jvDB90Gt1TUdoi59u+xwnqG99EzH1hE+Az7IZxPt5TzTsXWEn+cPcp5pa3HrqPwR+4CNN944p5xySlZeeeXMmTMnxxxzTL70pS+5HA8AAABgOdLhQqmDDz44o0aNyn777ZdOnTrl4IMPzhlnnFHpsgAAAABYgjpcKJUkZ555Zs4888xKlwEAAADAUlJd6QIAAAAA+OgRSgEAAABQOKEUAAAAAIUTSgEAAABQOKEUAAAAAIUTSgEAAABQOKEUAAAAAIUTSgEAAABQOKEUAAAAAIUTSgEAAABQOKEUAAAAAIUTSgEAAABQuJpKFwAAFOesZyZVuoSy6taWDEpy4XPvprW68r+SfHeTfpUuoUPSMwumZ2DJcJ5ZMOcZlndGSgEAAABQOKEUAAAAAIUTSgEAAABQOKEUAAAAAIUTSgEAAABQOKEUAAAAAIUTSgEAAABQOKEUAAAAAIUTSgEAAABQOKEUAAAAAIUTSgEAAABQOKEUAAAAAIUTSgEAAABQOKEUAAAAAIUTSgEAAABQOKEUAAAAAIUTSgEAAABQOKEUAAAAAIUTSgEAAABQOKEUAAAAAIUTSgEAAABQOKEUAAAAAIUTSgEAAABQOKEUAAAAAIUTSgEAAABQOKEUAAAAAIUTSgEAAABQOKEUAAAAAIUTSgEAAABQOKEUAAAAAIUTSgEAAABQOKEUAAAAAIUTSgEAAABQOKEUAAAAAIUTSgEAAABQuJpKF7C8O+uZSZUuIUlS3dqSQUkufO7dtFZX/tv+3U36VboEAAAAoIKMlAIAAACgcEIpAAAAAAonlAIAAACgcEIpAAAAAAonlAIAAACgcEIpAAAAAAonlAIAAACgcEIpAAAAAAonlAIAAACgcEIpAAAAAAonlAIAAACgcEIpAAAAAAonlAIAAACgcEIpAAAAAAonlAIAAACgcEIpAAAAAAonlAIAAACgcDWVLgD4f856ZlKlSyirbm3JoCQXPvduWqsrf6r47ib9Kl0CAAAAS5CRUgAAAAAUTigFAAAAQOGEUgAAAAAUTigFAAAAQOGEUgAAAAAUTigFAAAAQOGEUgAAAAAUTigFAAAAQOGEUgAAAAAUTigFAAAAQOGEUgAAAAAUTigFAAAAQOGEUgAAAAAUrkOGUs8//3w233zz9O7dOyNGjEipVKp0SQAAAAAsQR0ulGpsbMwee+yRoUOH5sknn8zo0aPzy1/+stJlAQAAALAEdbhQ6u677860adNywQUXZK211soZZ5yRa665ptJlAQAAALAE1VS6gA969tlns9VWW6Vr165JksGDB2f06NHz3baxsTGNjY3lx9OmTUuSTJ48Oc3NzUu/2MXQNH1KpUtIklS3tmTWrFlpmj4lrdWV/7a/+25VpUvokDpKvyR6ZlmhZxZMz8yfnlkwPTN/embB9Mz86ZkF0zPzp2cWTM/Mn55ZsI7SMzNmzEiSRU7HVFXqYBM2nXDCCWloaMhll11WXrbiiivm3//+d3r37t1m21NPPTWnnXZa0SUCAAAAsAivvvpqVl999QWur3yM9wE1NTWpr69vs6xz586ZNWvWPKHUySefnOOPP778uLW1NZMnT07fvn1TVdUx0sGOYvr06RkwYEBeffXV9OjRo9LlsAzQM7SXnqG99AztpWdoLz1De+kZ2kvPzF+pVMqMGTOy6qqrLnS7DhdK9enTJ88//3ybZTNmzEhdXd0829bX188TYPXq1WtplrfM69Gjhx8U2kXP0F56hvbSM7SXnqG99AztpWdoLz0zr549ey5ymw430fnmm2+exx9/vPx4/PjxaWxsTJ8+fSpYFQAAAABLUocLpbbddttMnz491157bZLkjDPOyI477phOnTpVuDIAAAAAlpQOd/le
TU1Nrr766hx44IEZMWJEqqur89BDD1W6rGVefX19fvSjH81zuSMsiJ6hvfQM7aVnaC89Q3vpGdpLz9BeeuZ/0+HuvjfXm2++maeeeipbbbVV+vbtW+lyAAAAAFiCOmwoBQAAAMDyq8PNKQUAAADA8k8oBQAAAEDhhFIAAAAAFE4oBQAAAEDhhFLLmcsuu2yxtx02bFhaWlqWYjUsqyZOnFjpEuignGP4MPQN7aVnaC89Q3vpGdpLzywdQqnlzMknn5wkaWhomGddY2NjDj/88IwaNSpJ8tJLL6WmpqbQ+uj4Lr300my00UaZPn16xo8fn2nTplW6JDoQ5xg+DH1De+kZ2kvP0F56hvbSM0uHUGo5M7fxjznmmPz4xz8uL581a1Z23HHH/OUvf0lra2uSpLrat5+2rr322nz/+9/PDTfckB49euTHP/5xNt100/zzn/+sdGl0EM4xfBj6hvbSM7SXnqG99AztpWeWDkdqObXVVlvl6quvzmc+85m8/PLL2XfffTNp0qQ8+uij2WijjSpdHh3QrFmz8tOf/jR33XVXdt555yTJmWeemZ133jnbbLNNrrjiigpXSEfiHMOHoW9oLz1De+kZ2kvP0F56ZskSSi2nvvrVr2b06NFZffXVs95662XkyJG5//77079//0qXRgfVtWvXjBkzJp/61KfKy/r375+f//zn+cMf/pDjjz8+N9xwQwUrpCNxjuHD0De0l56hvfQM7aVnaC89s2QJpZZj3bp1y9Zbb50ePXrkvffey7/+9a9Kl0QH1alTp3Tr1i19+/ZNjx49yl8rrLBCamtrs/rqq+eXv/xlvv71r+eVV16pdLl0EM4xfBj6hvbSM7SXnqG99AztpWeWHDNvLcceeuihjBgxInfffXdaWlryxS9+Mddcc0322GOPSpdGBzNhwoR06dIlnTp1arN8zpw5aWhoyEorrZRNN900t99+e7773e/md7/7XYUqpSNxjuHD0De0l56hvfQM7aVnaC89swSVWK707t27VCqVSu+8805ptdVWK1188cXldY8++mipX79+paeffrpUKpVKa6yxRkVqpOMaM2ZMad999y397ne/K02dOnWe9a+88kqpvr6+NHbs2ApUR0fgHMOHoW9oLz1De+kZ2kvP0F56Zulw+d5yavTo0Rk6dGiOOeaY8rKtt946xx57bL74xS+mVCqlVCpVsEI6ooaGhnTu3Dk/+MEPstJKK2WbbbbJHXfcUV7/8Y9/PLvttluuu+66ClZJR+Acw4ehb2gvPUN76RnaS8/QXnpmyXL53nJq2223zbbbbjvP8u985zvZYIMNUlVVlaqqqgpURkfWr1+//OY3v0mSvPXWW7nrrruy4oorttlm9913N68UzjF8KPqG9tIztJeeob30DO2lZ5YsodRyprm5OY888shCk9l+/frl4YcfTkNDQ4GV0dFdeumlOffcc/Pyyy+ntrY2r776ar7yla/kX//6V+66667suuuuSZIDDzwwnTt3rnC1VIpzDB+GvqG99AztpWdoLz1De+mZpaOqZFzZcmWNNdZI165dU1298CszW1tbM3v27IwbN66gyujInnrqqey0007585//nM022yyjRo3KlltumQceeCD//Oc/c8YZZ6RTp07Zf//988UvfjFbbrllpUumQpxj+DD0De2lZ2gvPUN76RnaS88sHUIpIEny4osvZtCgQUmSXXbZJTvvvHOOO+64JEmpVMoDDzyQiy++OHfddVeefPLJDBkypILVAgAAsKwTSn1EjRo1qny9K3zQu+++m759+8533bPPPpuNN9644IpYlsyYMSNVVVXp1q1bpUsBAAA6MKHUcqa1tTUTJkzIN7/5zdx1110ZO3ZsSqVS1l577fI248aNy4477phOnTrlwQcfzIABAypYMR3BuHHjstFGG2XmzJnzrGtqakqpVEp9fX0FKqOj+uEPf5gf//jH8133q1/9KieccEL++Mc/5pOf/GTBldERHXro
obn00kvTvXv3HHLIIVl11VVz/PHHZ+WVV55n22HDhuXGG29Mr169ii+UDqtUKqV///559dVXU1dXV+ly6ICampryt7/9Ldttt106deq0wO3GjRuXwYMH57333iuwOjqyt99+O4888khqa2vnWVcqldLU1JSmpqYcdNBBFaiOjuRTn/pU6urqFjmwo3PnzvnRj35kypPFZKLz5cyee+6Zj33sY3nggQeSJJdddll+/vOfZ/XVV8+OO+6YPffcMzvssEN23XXX3HfffenZs2eFK6YjqK+vT03NvKeDxsbG7L333vnYxz6WK664ogKV0VH96le/WmAodeihh+Y///lPnnrqKaEUSZLf/OY3KZVKue666zJq1Kh84hOfSGNjY3n9nDlz8qtf/SqHHXZYJk6caJQdmTp1aj7zmc/kmWeeSZJUVVVl8uTJAikWaNq0aRk2bFimTJmSHj16LHC7mpqaNDc3F1gZHd0rr7ySH/7wh6mtrZ1v2PD8889no402EkqR8ePH59Zbb02pVMoRRxyRn//856mtrc1ZZ52Vz3zmM9lqq62SJLfcckvOO++83HjjjRWueNkglFrO7L///vnSl76U3//+90mSCy64IOedd15efvnlXH311fniF7+YOXPmZPPNN89DDz200A9tPjpqamrm+atic3Nz9t5774wfPz5XXXVVhSqjo5rbL8OGDcukSZPa/BJXKpXy8ssv57rrrqtUeXQwQ4YMSWNjY44++uhMnDgx//nPf3LWWWelqqoqG220UY444oice+65Oeyww1JdXT3fkJyPlurq6rz88sttlplygIVZYYUVUiqV5jva5b/V1dUJN2ljiy22yKhRoxa4vnfv3nn66acLrIiOqnPnzuXRT927d89WW22V2trarLTSSll//fXL66qqqvLcc89VstRlit/6liMtLS156KGH8vTTT2fWrFk58MAD8+677+add97Jyy+/nJVXXjmf//zn06NHj9x4442ZPHlyVllllUqXTQd12GGHZezYsfnrX/+qTyh788038/jjj5cfv/zyy7nnnnvm2a5bt25ZbbXViiyNDqy6ujrDhw/PYYcdlrq6ugwdOjTJ+6Mxjz322Oy4446L/IckHy2dOnXSE7RLfX19qqqqUlVVldbW1gVu19raKuCkXfQLc73xxhv57Gc/m1KplFGjRuVzn/tc6uvr8/zzz+ftt9/O2LFjs/vuu2eLLbbIFltsUelylxlCqeVIU1NTBg0alO7du6euri4HHHBAevfunZkzZ2bQoEHp3bt3amtr061bt6y33nr53Oc+l6effjr9+vWrdOl0MG+++WbGjBmT++67TyBF2eTJkzNkyJAcfvjhbZbPvWsjzM+cOXPS1NSU1157LU888UT22muvfPWrXy2vHzduXFZYYYUKVggsL0ql0iLPJ6VSySXCwIfyt7/9LXV1dXnxxRez1lprpaWlJU1NTZk+fXpee+21/OMf/8gPf/jD7LXXXjnvvPPMj7mYhFLLka5du+a73/1ukuTUU0/NXnvtlSSpra3Nuuuu2+aH5rbbbsuuu+6aadOmCaVI8v5fDl999dXMvffBzTffnCSZMGFCeZtSqZSGhgYhxEdUnz598vTTT2fVVVfNDTfcUF5+1FFHpUuXLunZs2fWWWedbLHFFllzzTUrWCkdSalUypFHHplvfvObSZKdd945zc3Nqa2tzezZs3PiiSfmiSeeyIwZM3LXXXeZfJiyhoaGXHzxxeXHpVKpzePk/c+uhoaG8u8/8MILLyz05ixvv/12dthhhwIroiOaOXNmbrjhhkVOWj13onNIUh7pvdtuu2Xs2LGpra3NV77ylbzxxhupqqpKU1NTNttss0yZMiWf+tSnMnLkyFRXV1e46o5PKLWc+u+Ta/fu3dtcJ33MMcfk5ptvzuWXX16J0uigpk+fnoEDB5ZDqaqqqszv5pxVVVWZM2dO0eXRQay66qpJktmzZydJTj/99DQ0NKS1tTVvv/12
7r///hxzzDHZYYcdcuWVV5q3jtTU1OTaa6/N7373u1RXV6ehoSHPPvtsfvCDH+SAAw7Ic889l5deeilTpkzJVVddlSlTplS6ZDqIlpaW/OUvfyk/LpVKbR4n789/2NTUJJQic+bMSVVVVVZfffV07dp1gdt17ty5wKroqGbPnp3f/va36dy580Lv1pjExPiUXXrppamrq8usWbPyi1/8Ikly//3359e//nVKpVK++tWv5ve//3023XTT3HbbbQKpxSSUWk7NnDmzPFJh+vTp2WCDDdKjR4+stdZa+exnP5svfelLFa6QjqZXr1555513FrpNa2trmztm8dF11FFHpaWlJcOHD0+SPPLII+XL+s4999zsueee2W677fL3v/99oX+x5qOhoaEhX/3qV/Ovf/0rs2fPzvTp0/PWW29ljTXWyEsvvZQ77rgjgwcPzq233pqNNtqo0uXSQXTr1i233npr+XFdXV2bx/DfmpubUyqVFhkgtLS0+OMa6devXx588MHF2rZ3795LuRqWFc8991zq6+vT0tKSF198MaVSKY2Njdl6663T2tqarl27ZuONN05VVVX23XffSpe7zBBKLaeef/75TJgwIX369Enfvn3T1NSUd955J7/+9a9z1VVX5ac//WmuvPLKDBs2rNKl0oEs6i9FJp5lrk033TTnnXde9tlnnxx88MHp169fbrrppqywwgrp3bt37rjjjjz66KMCKZK8Px/Z3/72t7z66qtpbm7Oddddl7fffjs33nhjpk+fXunygOVAly5dctttt6V79+7zrJs8eXJWWGGF1NfXp6mpycTVtIt+Ya4rr7wySXL77bfn/PPPT1VVVR588MH06tUrnTp1Smtra3r27JkrrrgihxxySIWrXXYIpZZTa6yxRn76059ml112yf77758kGThwYH74wx/mjDPOyIQJE3LIIYfkpZdemu+HN8D8HH/88amvr89//vOfvPPOOxkzZkwaGhqyySab5PTTTy9vN3dy6913372C1dJR1NbWpmfPnnn33XfT1NSUHj16pLa2Nh/72Mdy1113pbW1NS0tLZUukw5GT9Ae77zzTg477LDccMMN2WmnndLQ0JCJEydmzTXXzMknn5wHHngg3/ve9zJ8+HBz1wEfyqxZs3LLLbfk9NNPT6lUyuuvv55nn302SfKPf/zDHfc+JKHUcqZPnz7la+WnTZuW2267Lccee2x5/fTp07PGGmvk5z//efbYYw+BFEneDxD88s/iWGuttcp/aX7jjTfy9NNPZ9SoUVlnnXWy/fbbl+cha2pqcqknZXV1dVlppZUyefLkNDQ0ZK+99sojjzySE044IT//+c8zZ86crLXWWkn8RZr3NTU1leeug8Vx7LHH5pOf/GR22mmnJMlFF12UCy64IOPGjctPf/rTrLfeejn99NNz/PHH54ADDshXvvKVbLnllhWumo5g1KhRC530vKGhoQJV0RFdeOGFeeONN3LppZfmP//5T4YMGZKRI0dmxowZ2XfffTNgwICccMIJ2Xfffc0n1Q5CqeXMqFGj0qVLlyTJt771reywww7Ze++9y+vfeuutDB06NKeffnr69u1boSrpaBoaGvzyz2I56qijkiR333131lprrXzzm9/MI488ku9973u56667ctlll5XDBZhrs802y7hx41JfX59OnTpl5ZVXznrrrZfq6upMmTIlM2fOzB//+MckSWNjY6ZNm5aePXtWuGoqqXfv3nnzzTfLj0ulUlpaWtLQ0GCiauZr//33z6c//ekk7/8R9pxzzsmZZ56ZFVZYISussEKOO+64HHvssfntb3+b73//++nTp49QiiTvj34ZO3Zsampq5gmlmpub87nPfa5CldHRjBkzJtdcc02S9/+t/e1vfzurr756kvfvWH7TTTfl9NNPz4knnpiLL744e+65ZyXLXWZUleZ3ey2WC3fccUfWWmutbLjhhm2Wn3feefnyl7+cd955J+uvv36FqqMjmTNn
Tt58882sttpqlS6FZcRbb72VCRMmZPPNNy8vu/zyy7PPPvtk5ZVXrmBlLGs+ONz9b3/7W7baaqvU1Pi7GW098cQT2WKLLdr8o7FUKuWdd97JSiutVMHK6GhmzJiRyy+/PCeccEJ5tMKECRPysY99rLy+ubk5ffr0qWSZdACTJk1Kv379kiQTJ05MS0tLuU9gYa677rocdNBB852T9ze/+U022mijDB48uAKVLXuEUsuxOXPmZOzYsVlnnXXmSf0nTJiQtddeO01NTRWqDljWfeELX8haa62VM888s9KlsIxoaGjIZz7zmTz++OPlZSussEJmzpyZJDn55JNz5ZVX5uWXX3a3o4+4WbNm5Xe/+10OOeSQhd5gY+zYsRk0aJBL0EmSbL755unUqVPOO++8fPrTn84bb7yRVVZZJa2trencuXOampry5ptv5tJLL81Pf/rTSpdLhTU1NWXAgAF56623kiQ33XRTpk6dWr6bMHzQ7NmzM2zYsPztb3+rdCnLFX+GXI5NmjQp66+/fiZNmjTPL/d1dXWpq6urUGXAsq6xsTEPPPBAtt122yTv/2J38MEH57e//W2qq6tz2GGHpX///gIryhOB1tfXp7W1NWPGjMmUKVNSKpVSKpVSW1ub2bNn56CDDsrDDz+c66+/XiBFZs2alSOOOCL777//QkOpuro6o+ooe+mll3LHHXdkwIABSZL11lsv06ZNS3V1dTp37pxp06bls5/9bDbZZJOUSiVz2H3E1dXVlac9+e1vf5uHH344DQ0NmTVrVvr165fRo0enuro6ra2t6dKlS77//e9XuGIqrXPnzpk6dWr58ZZbbjnfuchKpVKampryxBNPFFzhssmn+HKsrq4upVJpvrdkr6qqMvka8KH96le/SpcuXXLYYYclef/uarfccks6deqUr371q3n88cdz2223VbZIOoS5N9jo0qVLqqurM2PGjKy55prl9aVSKQ8//HCeeeaZPPbYY1l33XUrWC0dxQorrFAOLRfGH9n4b7W1tZk2bVqOOuqo/OlPf0rnzp3zve99L0lSXV2dY445Juutt16uv/56gRRJUv730LnnnpvevXunpaUld955Zy6//PJ07do1NTU1aWlpMZcdSd7/N/R/nztef/31PPDAA0neH7n7wAMP5Mgjj8yee+6ZO+64o1JlLnOEUsux6urqVFVV+QsisES9++67Oe2003LxxRdn++23zyOPPJL6+vpUVVXl9ttvz3PPPZdHH33UzRRIkvTs2TPNzc3lxyuuuGLeeeed8uM+ffpk5513zqhRo9K1a9dKlEgHNDdo+s9//rPQYOq/e4mPtlGjRqWqqirrrbdeJkyYkCSpqalJ586dy3eGPeecc9KjRw9/mCWzZ8/OSSedlClTpuTyyy9PfX19jj322MyYMSMXXXRR9t5779x777255ZZbMmLEiKyzzjqVLpkO4r9Dqbq6ugwaNChJMnPmzNx00005//zz2yxn0aQVACy25ubmHHLIIfnCF76Q/fbbL1/72tfKozGrqqqy1157Zeedd/YXRdr497//XQ4uS6VSXn311XxwSsvXX3/dL/3MY4MNNpinVz6oW7duBVVDR3XzzTdn+PDh6dmzZ5u5VOvq6vLDH/4wSXLRRRelf//+lSyTDqS6ujrdu3dPVVVVunTpMs/IuZkzZ+acc87J/vvvX54IHZJk/Pjx2WmnnVIqlfLWW29lp512SpK8/fbbWWWVVSpc3bJJKLWcmd+tkg1PBpaEBx54IOeee24+/vGP54ILLkjy/vnlxz/+cUqlUlpbW/PjH/+4vH3//v1z5JFHVqpcOojW1tZsscUW5VBq6tSpbe62N2fOnPzpT3/KoYcemj/84Q/ZYYcdKlgtHUlVVVVmzZo132kI5nrzzTdd8kk++9nP5tVXX80nPvGJJCmP1P3gnRonT56cmpqa9OjRoyJ10nHU19fn9NNPzw033JCWlpZMnDgxt99+exoaGvLWW2/lqquu
ym677ZbZs2fnmmuuyezZs/ODH/yg0mXTAfTv3z/nnXdeWltbs9tuu+Xcc89N8n7QOXDgwCT+/d1eQqnlyOuvv54BAwbM8wE8vxELpVLJXxaBdvnDH/6QRx55JCeddFKb88ykSZPKIxnmXkrT1NSUM888M+uvv3622WabitRLx1BdXV2eFLS5uTn9+vXLG2+8UV7fs2fP7LLLLjn88MOz++675/zzz883v/nNClVLRzFnzpw2/12QuaPv+Gj74M0RVltttey+++556623sueeeyZ5/3ffE088MRMmTMif//zn+d7GnY+mN998M01NTXnnnXfS3NycpqamvPrqq+X1ra2tmT17dgUrpCOpq6vLgAED0tramtra2my88caVLmmZJ5Rajqy88soZOXJk+a/RC/POO++UhxoCLI7LL78866+/fnbfffecdtppOfHEE5MkF198cZLkiiuuyCWXXFLevrm5Od27d69IrXRMVVVVOfvss8uPS6VS1l9//dTU1OTss8/OlltumQMPPDDbb799NthggwpWSqU1NzenVCq1mY9sQdstKrjio2PGjBn50pe+lE033TR9+vTJgQcemOT9c81f/vKXXHrppdlzzz3zzW9+M1dccUWFq6XS7rjjjjQ2NmbXXXfNn/70pxx++OGZMWNG3n333VxwwQW5+OKLc8ABB2TllVeudKl0IGPHjs2nPvWplEqlTJw4Mauttlr69u2bz3/+8zn++OMN/PgQhFLLkZqammy44YaLta0fFqC9qqqq8u1vfzuf/OQns8cee6SpqSmzZs1KY2Nj+fKaiRMn5oADDsgdd9yRX/ziFxWumI5m2rRpufrqq/P1r389//rXv/KjH/0of//738vr99133zz77LNZb731KlglHcEKK6yQe++9Nz179pxn3YQJE9K1a9f069cvpVKpzd0c+Wj7yU9+ku7du+czn/lMm8s6W1tb8+1vfztdu3bN73//+xx11FFtPrv4aDr99NMzderUXH755fOdU+rpp5/Oz372s/zgBz8o320Y1llnnYwcOTJJssYaa2T8+PEZN25cLr300gwePDhPPPGEEbztJJQCoF223HLL3H///dlmm21yzTXXlH+pL5VK6d+/f/r165e99947999/v7t/kiQ54ogjUl9fn+bm5rz00kv51re+lVKplHvuuSff+MY3yndXq66uztZbby2UImPHjs1ee+2VW265JTvvvHOmTp2akSNHZptttslpp52WW2+9NUcffXSOOOKIPP/885Uulw5g1qxZuemmm/K1r30t6667bm655ZZst912qampyfDhw/Ozn/0syfvzTf3ud7+rbLF0CE888UTWXHPNXHHFFRk6dGh5+Zw5czJx4sQcdNBBufDCC/0uQxv/HWDOmTMn48ePT5IcffTR2WijjTJjxow0Nzdn/PjxWeP/a+/eQqJq2zCOX7MoQ0sxCaEax80yKAUjCALTUAg1LAwrrGhPgUXQ7sCINgZiQiR01giRhtJBRJQWlUmTYASBtDWLLNPIjNAKk3JG5zv4aD7kq9f1vvmumej/O3tmPbf3fThePM+axMRgjflb4fdQ/1B+v58EF8A/lpaWppMnT2rHjh3q6enRt2/fAldoamtr1dfXp+Li4iBPiVCRkJCg+Ph4xcXFaeLEiTJNU8nJySooKFBLS4tM05RpmoqOjtb69ev16dOnYI+MINu+fbtWr16t3NxcSVJFRYXWrl2rL1++yO126/Tp02publZSUpLy8vJ0+fLlIE+MYNuwYYPCwsJUVFSkly9fauvWrXr69KkmTZqkwsJCHTlyRJmZmWptbQ32qAgh3/8fysvLC3yWn5+v9vZ27du3T06nUytXrpTH4wnShAglw8PDo66VG4ah/Px8FRQUaPny5aqsrFRBQYEMw9DixYs1MDAQxGl/Hw4/ycQfqbOzU7NmzRrzXQ0A8FcWLVqkgwcPKicnZ9RViLa2NjU2NmrXrl1BnhChpK+vT3Pnzg28
QPbJkyeaP3++Ojo6NGPGDEnS9OnTdf36daWlpQVzVATZvXv3NHv2bEVGRurdu3dKTk5WTU2NVqxYMWpfS0uL9u7dq8LCQpWUlARpWoSCR48eyel0aurUqSovL5dhGNq/f3/gudfrVXl5uY4dO6bDhw/rwIEDQZwWocDr9crpdKq3t1eSdP78eQ0ODmrjxo2BPZ8/f1ZVVZXcbrcePHigiIiIYI2LEDA4OKjU1NTA6SiMD0KpP9TXr1/V2tqq9PT0YI8C4Dc2MDDAO+pg2dDQkO7cuaOsrKzAZx6PZ9R6aGhIYWFh9g+HkOXz+XTx4kWtWrXqh8+Hh4c1NDSk8PBwmydDqPL7/RoZGfnhL+y1tLRozpw5iomJCcJkCCV+v19tbW2Bd/J2dXXJ6/XKNM3/28s7yPDdyMiIDIMLZ+OJUAoAAAAAAAC2I+IDAAAAAACA7QilAAAAAAAAYDtCKQAAAAAAANiOUAoAAAAAAAC2I5QCAAD4Bzo7O+VwOH66/rd5PB4lJCTY1g8AAGC8EUoBAACMA5fLpf7+/r9dl5CQII/H87frMjIy9PDhQ9v6AQAAjDdCKQAAgHFgGIaio6Nt6zdhwgRFRUXZ1g8AAGC8EUoBAABY1NDQoOTkZE2bNk1nz54d9exn1/du3ryplJQURUREKD09XS9evJAk5eXlyeFw6PXr18rOzpbD4VBFRUWgLisrS9XV1aqsrFR8fLyuXr066u/+7PpeU1OT0tLSFBkZqSVLlujNmzeW+gEAANiNUAoAAMCC3t5eFRUVqaSkRHfv3lVDQ4OlunXr1mnz5s169uyZUlJSdOjQIUnShQsX1N/fr7i4ONXX16u/v1979uwZVet2u9XY2Ci3260FCxaM2evVq1datmyZdu/erba2NkVFRWnnzp2W+wEAANhpQrAHAAAA+B1cu3ZNiYmJ2rZtmySptLRU+fn5Y9aFh4fL6/UqJiZGVVVV8vl8kqTJkydL+u+1vylTpvzw6t/AwIBu376tsLAwSzOeO3dOmZmZ2rJliyTpxIkTun//vuV+AAAAduKkFAAAgAU9PT1yuVyBtWmalupqa2t169YtzZw5U9nZ2Xr8+LHlnsXFxZYDKUnq7u5WUlJSYO10OrV06VLL9QAAAHYilAIAALAgNjZWb9++Day7urrGrBkcHJTP51NjY6M+fPigjIwMbdq0adQewzDk9/t/WP/9dJNVcXFx6uzsDKyfP3+uefPmaWRkxFI/AAAAOxFKAQAAWJCTk6P29nbV1NSoo6NDpaWlY9b4fD7l5uaqrq5O79+/l9/vD1zf+840Td24cUM9PT1qamr6pRnXrFmj5uZmVVdXq7u7W2VlZYqNjZVh/O8r33j2AwAA+BWEUgAAABY4nU7V1dXp6NGjysjI0MKFC8esiYqKUm1trcrKymSapurr63Xq1KlRe44fP64rV67I5XJZCrr+SmJioi5duqTKykqlpqbq48ePOnPmzL/WDwAA4Fc4/JzfBgAAAAAAgM04KQUAAAAAAADbEUoBAAAAAADAdoRSAAAAAAAAsB2hFAAAAAAAAGxHKAUAAAAAAADbEUoBAAAAAADAdoRSAAAAAAAAsB2hFAAAAAAAAGxHKAUAAAAAAADbEUoBAAAAAADAdoRSAAAAAAAAsN1/AER1lvPYai2SAAAAAElFTkSuQmCC",
      "text/plain": [
       "<Figure size 1200x800 with 1 Axes>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "import matplotlib.pyplot as plt\n",
    "\n",
    "res_df = df.groupby('district').size().reset_index(name='公司数量')\n",
    "res_df.set_index('district', inplace=True)\n",
    "\n",
    "# Use a font with CJK glyphs so the Chinese title and labels are not garbled\n",
    "plt.rcParams['font.sans-serif'] = ['SimHei'] # SimHei as an example; the font must be installed\n",
    "plt.rcParams['axes.unicode_minus'] = False # keep the minus sign rendering correctly\n",
    "\n",
    "# Draw the bar chart\n",
    "# kind='bar' selects a vertical bar chart\n",
    "ax = res_df.plot(kind='bar', \n",
    "                title='杭州市各区公司数量分布',\n",
    "                xlabel='district',\n",
    "                ylabel='公司数量',\n",
    "                figsize=(12, 8), # figure size in inches\n",
    "                color='skyblue', # bar color\n",
    "                grid=True) # show grid lines\n",
    "\n",
    "# Optional: show the count above each bar\n",
    "# for i, v in enumerate(res_df['公司数量']):\n",
    "#     ax.text(i, v + 0.1, str(v), ha='center', va='bottom') # adjust the label position\n",
    "\n",
    "# Adjust the layout automatically\n",
    "plt.tight_layout()\n",
    "# Show the figure\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Aggregation"
   ]
  },
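  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "pandas provides a rich set of aggregation methods to call after `groupby`; they carry out the apply step of the split-apply-combine pattern that underlies grouped data analysis. The table below summarizes them.\n",
    "\n",
    "### Aggregation method cheat sheet\n",
    "\n",
    "| Method | Description | Typical use |\n",
    "| :--- | :--- | :--- |\n",
    "| **Basic statistics** | | |\n",
    "| `sum()` | Sum of the non-NA values in each group. | Total sales per product. |\n",
    "| `mean()` | Mean of the non-NA values in each group. | Average score per class. |\n",
    "| `median()` | Median of the non-NA values in each group. | Median house price, which is robust to extreme values. |\n",
    "| `min()` / `max()` | Minimum / maximum of the non-NA values in each group. | Highest / lowest temperature per month. |\n",
    "| `std()` / `var()` | Sample standard deviation / variance of each group (denominator n-1). | Volatility of portfolio returns. |\n",
    "| `count()` | Number of non-NA values in each group. | Headcount per department, provided the counted column has no missing values. |\n",
    "| `size()` | Number of rows in each group, NA values included. | Total number of records per category. |\n",
    "| `first()` / `last()` | First / last non-NA value in each group. | A user's earliest / most recent login, given rows sorted by time. |\n",
    "| `prod()` | Product of the non-NA values in each group. | Compounding per-period growth factors into a cumulative growth factor. |\n",
    "| **Quantiles and descriptive statistics** | | |\n",
    "| `quantile(q=0.5)` | Quantile of each group (`q` is the quantile level; 0.5 gives the median). | The 75th percentile of income, via `quantile(0.75)`. |\n",
    "| `describe()` | Descriptive summary per group: count, mean, standard deviation, minimum, quartiles, maximum. | A quick look at the distribution of numeric columns. |\n",
    "| `sem()` | Standard error of the mean for each group. | Judging how reliably a sample mean estimates the population mean. |\n",
    "| **Other methods** | | |\n",
    "| `nunique()` | Number of distinct values in each group (NA excluded by default). | Number of different products bought by each customer. |\n",
    "| `cumsum()` / `cummax()` | Running sum / running maximum within each group. These are transformations rather than aggregations: the result has one row per input row. | Running sales totals, or a running-maximum series for plotting. |\n",
    "| `ohlc()` | Open, high, low, and close values of each group, typically for time series data. | Stock price analysis. |\n",
    "| `agg()` / `aggregate()` | **The core method**: applies one function, several functions, or different functions per column in a single call, including custom functions. | Flexible enough for most complex aggregation tasks. |\n",
    "\n",
    "### Core techniques for aggregation\n",
    "\n",
    "Knowing how to combine the methods above makes grouped analysis considerably more effective.\n",
    "\n",
    "1.  **Multiple functions with `agg()`**: `agg()` is the most flexible aggregation method. It can apply several functions to the same column at once, or different functions to different columns.\n",
    "    ```python\n",
    "    # Compute the sum, mean, and standard deviation of the sales column in one call\n",
    "    result = df.groupby('产品类别')['销售额'].agg(['sum', 'mean', 'std'])\n",
    "\n",
    "    # Different functions for different columns: sum the sales, average the profit\n",
    "    result = df.groupby('产品类别').agg({'销售额': 'sum', '利润': 'mean'})\n",
    "\n",
    "    # Named aggregation (clearer output column names)\n",
    "    result = df.groupby('产品类别').agg(\n",
    "        总销售额=('销售额', 'sum'),\n",
    "        平均利润=('利润', 'mean'),\n",
    "        最高评分=('用户评分', 'max')\n",
    "    )\n",
    "    ```\n",
    "\n",
    "2.  **Handling the grouped result**: By default, the grouping columns become the index of the result. Pass `as_index=False` to `groupby`, or call `reset_index()` after aggregating, to turn them back into ordinary columns.\n",
    "    ```python\n",
    "    # Option 1: set it when grouping\n",
    "    result = df.groupby('城市', as_index=False)['销售额'].sum()\n",
    "\n",
    "    # Option 2: reset the index after aggregating\n",
    "    result = df.groupby('城市')['销售额'].sum().reset_index()\n",
    "    ```"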
   ]
  },
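  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Two distinctions from the cheat sheet are easy to miss: `count()` versus `size()`, and aggregation versus transformation. The sketch below is only an illustration; the frame `demo` and its columns `team` and `score` are made up for this note and are not part of the exercise data. It shows how the two counts differ when a group contains a missing value, and that `cumsum()` keeps one row per input row instead of one row per group.\n",
    "\n",
    "```python\n",
    "import numpy as np\n",
    "import pandas as pd\n",
    "\n",
    "# Hypothetical data: team 'a' has one missing score\n",
    "demo = pd.DataFrame({\n",
    "    'team': ['a', 'a', 'a', 'b', 'b'],\n",
    "    'score': [1.0, np.nan, 3.0, 4.0, 5.0],\n",
    "})\n",
    "grouped = demo.groupby('team')['score']\n",
    "\n",
    "print(grouped.count())   # non-NA values per group: a -> 2, b -> 2\n",
    "print(grouped.size())    # rows per group, NA included: a -> 3, b -> 2\n",
    "print(grouped.cumsum())  # running sum within each group; same length as demo\n",
    "```\n",
    "\n",
    "When the goal is simply the number of rows per group, `size()` is usually the safer choice, since `count()` depends on which column is being counted."
   ]
  },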
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 17 - Aggregate statistics\n",
    "\n",
    "Group by district and compute the minimum, maximum, and mean salary for each district."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "application/vnd.microsoft.datawrangler.viewer.v0+json": {
       "columns": [
        {
         "name": "district",
         "rawType": "object",
         "type": "string"
        },
        {
         "name": "('salary', 'min')",
         "rawType": "int64",
         "type": "integer"
        },
        {
         "name": "('salary', 'max')",
         "rawType": "int64",
         "type": "integer"
        },
        {
         "name": "('salary', 'mean')",
         "rawType": "float64",
         "type": "float"
        }
       ],
       "ref": "79f94fd8-0e8a-48f4-95cf-4df5d6366409",
       "rows": [
        [
         "上城区",
         "22500",
         "30000",
         "26250.0"
        ],
        [
         "下沙",
         "30000",
         "30000",
         "30000.0"
        ],
        [
         "余杭区",
         "7500",
         "60000",
         "33583.33"
        ],
        [
         "拱墅区",
         "24000",
         "30000",
         "28500.0"
        ],
        [
         "江干区",
         "3500",
         "45000",
         "25250.0"
        ],
        [
         "滨江区",
         "7500",
         "50000",
         "31428.57"
        ],
        [
         "萧山区",
         "25000",
         "45000",
         "36250.0"
        ],
        [
         "西湖区",
         "6500",
         "45000",
         "30893.94"
        ]
       ],
       "shape": {
        "columns": 3,
        "rows": 8
       }
      },
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead tr th {\n",
       "        text-align: left;\n",
       "    }\n",
       "\n",
       "    .dataframe thead tr:last-of-type th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr>\n",
       "      <th></th>\n",
       "      <th colspan=\"3\" halign=\"left\">salary</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th></th>\n",
       "      <th>min</th>\n",
       "      <th>max</th>\n",
       "      <th>mean</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>district</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>上城区</th>\n",
       "      <td>22500</td>\n",
       "      <td>30000</td>\n",
       "      <td>26250.00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>下沙</th>\n",
       "      <td>30000</td>\n",
       "      <td>30000</td>\n",
       "      <td>30000.00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>余杭区</th>\n",
       "      <td>7500</td>\n",
       "      <td>60000</td>\n",
       "      <td>33583.33</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>拱墅区</th>\n",
       "      <td>24000</td>\n",
       "      <td>30000</td>\n",
       "      <td>28500.00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>江干区</th>\n",
       "      <td>3500</td>\n",
       "      <td>45000</td>\n",
       "      <td>25250.00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>滨江区</th>\n",
       "      <td>7500</td>\n",
       "      <td>50000</td>\n",
       "      <td>31428.57</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>萧山区</th>\n",
       "      <td>25000</td>\n",
       "      <td>45000</td>\n",
       "      <td>36250.00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>西湖区</th>\n",
       "      <td>6500</td>\n",
       "      <td>45000</td>\n",
       "      <td>30893.94</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "         salary                 \n",
       "            min    max      mean\n",
       "district                        \n",
       "上城区       22500  30000  26250.00\n",
       "下沙        30000  30000  30000.00\n",
       "余杭区        7500  60000  33583.33\n",
       "拱墅区       24000  30000  28500.00\n",
       "江干区        3500  45000  25250.00\n",
       "滨江区        7500  50000  31428.57\n",
       "萧山区       25000  45000  36250.00\n",
       "西湖区        6500  45000  30893.94"
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.groupby('district').agg({\n",
    "    'salary':['min','max','mean']\n",
    "}).round(2)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 18 - 聚合统计｜修改列名\n",
    "\n",
    "将上一题的列名（包括索引名）修改为中文"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "application/vnd.microsoft.datawrangler.viewer.v0+json": {
       "columns": [
        {
         "name": "行政区",
         "rawType": "object",
         "type": "string"
        },
        {
         "name": "最低工资",
         "rawType": "int64",
         "type": "integer"
        },
        {
         "name": "最高工资",
         "rawType": "int64",
         "type": "integer"
        },
        {
         "name": "平均工资",
         "rawType": "float64",
         "type": "float"
        }
       ],
       "ref": "0dcea7d6-4603-4500-9fff-5cf3cc0274d0",
       "rows": [
        [
         "上城区",
         "22500",
         "30000",
         "26250.0"
        ],
        [
         "下沙",
         "30000",
         "30000",
         "30000.0"
        ],
        [
         "余杭区",
         "7500",
         "60000",
         "33583.33"
        ],
        [
         "拱墅区",
         "24000",
         "30000",
         "28500.0"
        ],
        [
         "江干区",
         "3500",
         "45000",
         "25250.0"
        ],
        [
         "滨江区",
         "7500",
         "50000",
         "31428.57"
        ],
        [
         "萧山区",
         "25000",
         "45000",
         "36250.0"
        ],
        [
         "西湖区",
         "6500",
         "45000",
         "30893.94"
        ]
       ],
       "shape": {
        "columns": 3,
        "rows": 8
       }
      },
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>最低工资</th>\n",
       "      <th>最高工资</th>\n",
       "      <th>平均工资</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>行政区</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>上城区</th>\n",
       "      <td>22500</td>\n",
       "      <td>30000</td>\n",
       "      <td>26250.00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>下沙</th>\n",
       "      <td>30000</td>\n",
       "      <td>30000</td>\n",
       "      <td>30000.00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>余杭区</th>\n",
       "      <td>7500</td>\n",
       "      <td>60000</td>\n",
       "      <td>33583.33</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>拱墅区</th>\n",
       "      <td>24000</td>\n",
       "      <td>30000</td>\n",
       "      <td>28500.00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>江干区</th>\n",
       "      <td>3500</td>\n",
       "      <td>45000</td>\n",
       "      <td>25250.00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>滨江区</th>\n",
       "      <td>7500</td>\n",
       "      <td>50000</td>\n",
       "      <td>31428.57</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>萧山区</th>\n",
       "      <td>25000</td>\n",
       "      <td>45000</td>\n",
       "      <td>36250.00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>西湖区</th>\n",
       "      <td>6500</td>\n",
       "      <td>45000</td>\n",
       "      <td>30893.94</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "      最低工资   最高工资      平均工资\n",
       "行政区                        \n",
       "上城区  22500  30000  26250.00\n",
       "下沙   30000  30000  30000.00\n",
       "余杭区   7500  60000  33583.33\n",
       "拱墅区  24000  30000  28500.00\n",
       "江干区   3500  45000  25250.00\n",
       "滨江区   7500  50000  31428.57\n",
       "萧山区  25000  45000  36250.00\n",
       "西湖区   6500  45000  30893.94"
      ]
     },
     "execution_count": 12,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "(df.groupby('district')\n",
    "     .agg(\n",
    "         最低工资=('salary','min'),\n",
    "         最高工资=('salary','max'),\n",
    "         平均工资=('salary','mean')\n",
    "     ) # 聚合同时重命名列\n",
    "     .round(2)\n",
    "     .rename_axis('行政区') # 直接修改索引名\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 19 - 聚合统计｜组合\n",
    "\n",
    "对不同岗位(`positionName`)进行分组，并统计其薪水(`salary`)中位数和得分(`score`)均值"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "application/vnd.microsoft.datawrangler.viewer.v0+json": {
       "columns": [
        {
         "name": "positionName",
         "rawType": "object",
         "type": "string"
        },
        {
         "name": "salary_median",
         "rawType": "float64",
         "type": "float"
        },
        {
         "name": "score_mean",
         "rawType": "float64",
         "type": "float"
        }
       ],
       "ref": "50c14c27-3394-4e8c-b0a7-7e1eb7d238c9",
       "rows": [
        [
         "BI数据分析师",
         "20000.0",
         "2.67"
        ],
        [
         "bi数据分析师",
         "40000.0",
         "5.0"
        ],
        [
         "业务与数据分析师",
         "30000.0",
         "3.0"
        ],
        [
         "产品经理/数据分析（核心业务）-2020届春招",
         "60000.0",
         "3.0"
        ],
        [
         "产品运营（偏数据分析）",
         "27500.0",
         "15.0"
        ],
        [
         "商业数据分析",
         "35000.0",
         "0.0"
        ],
        [
         "商业数据分析师",
         "37500.0",
         "5.0"
        ],
        [
         "商业数据分析师（阿里数据银行）",
         "22500.0",
         "3.0"
        ],
        [
         "大数据分析工程师(J11108)",
         "30000.0",
         "17.0"
        ],
        [
         "大数据建模总监",
         "37500.0",
         "14.0"
        ],
        [
         "奔驰·耀出行-BI数据分析专家",
         "30000.0",
         "0.0"
        ],
        [
         "奔驰耀出行-战略数据分析师",
         "42500.0",
         "1.0"
        ],
        [
         "店铺数据分析师",
         "30000.0",
         "6.0"
        ],
        [
         "数据分析",
         "30000.0",
         "82.71"
        ],
        [
         "数据分析-2020届春招",
         "30000.0",
         "4.0"
        ],
        [
         "数据分析专员",
         "26250.0",
         "3.0"
        ],
        [
         "数据分析专家",
         "31250.0",
         "8.17"
        ],
        [
         "数据分析专家-LQ(J181203029)",
         "21500.0",
         "0.0"
        ],
        [
         "数据分析专家03-10-217",
         "23750.0",
         "4.5"
        ],
        [
         "数据分析专家（游戏业务）",
         "37500.0",
         "12.0"
        ],
        [
         "数据分析实习生",
         "40000.0",
         "4.0"
        ],
        [
         "数据分析实习生 (MJ000087)",
         "26500.0",
         "3.0"
        ],
        [
         "数据分析工程师",
         "20000.0",
         "16.0"
        ],
        [
         "数据分析师",
         "37500.0",
         "6.5"
        ],
        [
         "数据分析师 (MJ000250)",
         "27500.0",
         "4.5"
        ],
        [
         "数据分析师(J10147)",
         "37500.0",
         "3.0"
        ],
        [
         "数据分析师-Lark",
         "30000.0",
         "2.0"
        ],
        [
         "数据分析师-企业SaaS应用",
         "40000.0",
         "2.0"
        ],
        [
         "数据分析师/BI",
         "45000.0",
         "5.0"
        ],
        [
         "数据分析师（保险）13-01-19",
         "40000.0",
         "4.0"
        ],
        [
         "数据分析师（社招）",
         "30000.0",
         "15.0"
        ],
        [
         "数据分析师（财务方向）",
         "37500.0",
         "5.0"
        ],
        [
         "数据分析建模工程师",
         "30000.0",
         "0.0"
        ],
        [
         "数据分析建模工程师（校招）",
         "36500.0",
         "0.0"
        ],
        [
         "数据分析经理",
         "30000.0",
         "6.5"
        ],
        [
         "数据分析负责人 or 数据分析师",
         "30000.0",
         "4.0"
        ],
        [
         "数据建模",
         "15000.0",
         "176.0"
        ],
        [
         "数据建模专家-杭州-01546",
         "30000.0",
         "12.0"
        ],
        [
         "数据建模工程师",
         "36250.0",
         "24.0"
        ],
        [
         "旅游大数据分析师（杭州）",
         "30000.0",
         "1.0"
        ],
        [
         "智能数据分析引擎研发专家",
         "30000.0",
         "3.0"
        ],
        [
         "浙江数据分析师",
         "37500.0",
         "5.0"
        ],
        [
         "解决方案顾问/数据分析师",
         "25000.0",
         "4.5"
        ],
        [
         "财务数据分析师",
         "37500.0",
         "4.5"
        ],
        [
         "资深数据分析/数据分析专家G00796",
         "45000.0",
         "4.0"
        ],
        [
         "资深数据分析专员",
         "30000.0",
         "1.0"
        ],
        [
         "资深数据分析师",
         "30000.0",
         "6.67"
        ],
        [
         "资深数据分析师 (MJ000088)",
         "25000.0",
         "4.0"
        ],
        [
         "资深数据分析师（商品方向）G01053",
         "45000.0",
         "4.0"
        ],
        [
         "资深数据分析师（杭州）",
         "37500.0",
         "15.0"
        ]
       ],
       "shape": {
        "columns": 2,
        "rows": 55
       }
      },
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>salary_median</th>\n",
       "      <th>score_mean</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>positionName</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>BI数据分析师</th>\n",
       "      <td>20000.0</td>\n",
       "      <td>2.67</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>bi数据分析师</th>\n",
       "      <td>40000.0</td>\n",
       "      <td>5.00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>业务与数据分析师</th>\n",
       "      <td>30000.0</td>\n",
       "      <td>3.00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>产品经理/数据分析（核心业务）-2020届春招</th>\n",
       "      <td>60000.0</td>\n",
       "      <td>3.00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>产品运营（偏数据分析）</th>\n",
       "      <td>27500.0</td>\n",
       "      <td>15.00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>商业数据分析</th>\n",
       "      <td>35000.0</td>\n",
       "      <td>0.00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>商业数据分析师</th>\n",
       "      <td>37500.0</td>\n",
       "      <td>5.00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>商业数据分析师（阿里数据银行）</th>\n",
       "      <td>22500.0</td>\n",
       "      <td>3.00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>大数据分析工程师(J11108)</th>\n",
       "      <td>30000.0</td>\n",
       "      <td>17.00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>大数据建模总监</th>\n",
       "      <td>37500.0</td>\n",
       "      <td>14.00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>奔驰·耀出行-BI数据分析专家</th>\n",
       "      <td>30000.0</td>\n",
       "      <td>0.00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>奔驰耀出行-战略数据分析师</th>\n",
       "      <td>42500.0</td>\n",
       "      <td>1.00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>店铺数据分析师</th>\n",
       "      <td>30000.0</td>\n",
       "      <td>6.00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>数据分析</th>\n",
       "      <td>30000.0</td>\n",
       "      <td>82.71</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>数据分析-2020届春招</th>\n",
       "      <td>30000.0</td>\n",
       "      <td>4.00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>数据分析专员</th>\n",
       "      <td>26250.0</td>\n",
       "      <td>3.00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>数据分析专家</th>\n",
       "      <td>31250.0</td>\n",
       "      <td>8.17</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>数据分析专家-LQ(J181203029)</th>\n",
       "      <td>21500.0</td>\n",
       "      <td>0.00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>数据分析专家03-10-217</th>\n",
       "      <td>23750.0</td>\n",
       "      <td>4.50</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>数据分析专家（游戏业务）</th>\n",
       "      <td>37500.0</td>\n",
       "      <td>12.00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>数据分析实习生</th>\n",
       "      <td>40000.0</td>\n",
       "      <td>4.00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>数据分析实习生 (MJ000087)</th>\n",
       "      <td>26500.0</td>\n",
       "      <td>3.00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>数据分析工程师</th>\n",
       "      <td>20000.0</td>\n",
       "      <td>16.00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>数据分析师</th>\n",
       "      <td>37500.0</td>\n",
       "      <td>6.50</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>数据分析师 (MJ000250)</th>\n",
       "      <td>27500.0</td>\n",
       "      <td>4.50</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>数据分析师(J10147)</th>\n",
       "      <td>37500.0</td>\n",
       "      <td>3.00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>数据分析师-Lark</th>\n",
       "      <td>30000.0</td>\n",
       "      <td>2.00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>数据分析师-企业SaaS应用</th>\n",
       "      <td>40000.0</td>\n",
       "      <td>2.00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>数据分析师/BI</th>\n",
       "      <td>45000.0</td>\n",
       "      <td>5.00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>数据分析师（保险）13-01-19</th>\n",
       "      <td>40000.0</td>\n",
       "      <td>4.00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>数据分析师（社招）</th>\n",
       "      <td>30000.0</td>\n",
       "      <td>15.00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>数据分析师（财务方向）</th>\n",
       "      <td>37500.0</td>\n",
       "      <td>5.00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>数据分析建模工程师</th>\n",
       "      <td>30000.0</td>\n",
       "      <td>0.00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>数据分析建模工程师（校招）</th>\n",
       "      <td>36500.0</td>\n",
       "      <td>0.00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>数据分析经理</th>\n",
       "      <td>30000.0</td>\n",
       "      <td>6.50</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>数据分析负责人 or 数据分析师</th>\n",
       "      <td>30000.0</td>\n",
       "      <td>4.00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>数据建模</th>\n",
       "      <td>15000.0</td>\n",
       "      <td>176.00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>数据建模专家-杭州-01546</th>\n",
       "      <td>30000.0</td>\n",
       "      <td>12.00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>数据建模工程师</th>\n",
       "      <td>36250.0</td>\n",
       "      <td>24.00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>旅游大数据分析师（杭州）</th>\n",
       "      <td>30000.0</td>\n",
       "      <td>1.00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>智能数据分析引擎研发专家</th>\n",
       "      <td>30000.0</td>\n",
       "      <td>3.00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>浙江数据分析师</th>\n",
       "      <td>37500.0</td>\n",
       "      <td>5.00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>解决方案顾问/数据分析师</th>\n",
       "      <td>25000.0</td>\n",
       "      <td>4.50</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>财务数据分析师</th>\n",
       "      <td>37500.0</td>\n",
       "      <td>4.50</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>资深数据分析/数据分析专家G00796</th>\n",
       "      <td>45000.0</td>\n",
       "      <td>4.00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>资深数据分析专员</th>\n",
       "      <td>30000.0</td>\n",
       "      <td>1.00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>资深数据分析师</th>\n",
       "      <td>30000.0</td>\n",
       "      <td>6.67</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>资深数据分析师 (MJ000088)</th>\n",
       "      <td>25000.0</td>\n",
       "      <td>4.00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>资深数据分析师（商品方向）G01053</th>\n",
       "      <td>45000.0</td>\n",
       "      <td>4.00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>资深数据分析师（杭州）</th>\n",
       "      <td>37500.0</td>\n",
       "      <td>15.00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>金融数据分析师</th>\n",
       "      <td>22500.0</td>\n",
       "      <td>5.00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>银行数据分析岗</th>\n",
       "      <td>50000.0</td>\n",
       "      <td>5.00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>高级数据分析专员</th>\n",
       "      <td>22500.0</td>\n",
       "      <td>4.00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>高级数据分析师</th>\n",
       "      <td>30000.0</td>\n",
       "      <td>3.67</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>高级财务数据分析师</th>\n",
       "      <td>28750.0</td>\n",
       "      <td>4.50</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "                         salary_median  score_mean\n",
       "positionName                                      \n",
       "BI数据分析师                        20000.0        2.67\n",
       "bi数据分析师                        40000.0        5.00\n",
       "业务与数据分析师                       30000.0        3.00\n",
       "产品经理/数据分析（核心业务）-2020届春招        60000.0        3.00\n",
       "产品运营（偏数据分析）                    27500.0       15.00\n",
       "商业数据分析                         35000.0        0.00\n",
       "商业数据分析师                        37500.0        5.00\n",
       "商业数据分析师（阿里数据银行）                22500.0        3.00\n",
       "大数据分析工程师(J11108)               30000.0       17.00\n",
       "大数据建模总监                        37500.0       14.00\n",
       "奔驰·耀出行-BI数据分析专家                30000.0        0.00\n",
       "奔驰耀出行-战略数据分析师                  42500.0        1.00\n",
       "店铺数据分析师                        30000.0        6.00\n",
       "数据分析                           30000.0       82.71\n",
       "数据分析-2020届春招                   30000.0        4.00\n",
       "数据分析专员                         26250.0        3.00\n",
       "数据分析专家                         31250.0        8.17\n",
       "数据分析专家-LQ(J181203029)          21500.0        0.00\n",
       "数据分析专家03-10-217                23750.0        4.50\n",
       "数据分析专家（游戏业务）                   37500.0       12.00\n",
       "数据分析实习生                        40000.0        4.00\n",
       "数据分析实习生 (MJ000087)             26500.0        3.00\n",
       "数据分析工程师                        20000.0       16.00\n",
       "数据分析师                          37500.0        6.50\n",
       "数据分析师 (MJ000250)               27500.0        4.50\n",
       "数据分析师(J10147)                  37500.0        3.00\n",
       "数据分析师-Lark                     30000.0        2.00\n",
       "数据分析师-企业SaaS应用                 40000.0        2.00\n",
       "数据分析师/BI                       45000.0        5.00\n",
       "数据分析师（保险）13-01-19              40000.0        4.00\n",
       "数据分析师（社招）                      30000.0       15.00\n",
       "数据分析师（财务方向）                    37500.0        5.00\n",
       "数据分析建模工程师                      30000.0        0.00\n",
       "数据分析建模工程师（校招）                  36500.0        0.00\n",
       "数据分析经理                         30000.0        6.50\n",
       "数据分析负责人 or 数据分析师               30000.0        4.00\n",
       "数据建模                           15000.0      176.00\n",
       "数据建模专家-杭州-01546                30000.0       12.00\n",
       "数据建模工程师                        36250.0       24.00\n",
       "旅游大数据分析师（杭州）                   30000.0        1.00\n",
       "智能数据分析引擎研发专家                   30000.0        3.00\n",
       "浙江数据分析师                        37500.0        5.00\n",
       "解决方案顾问/数据分析师                   25000.0        4.50\n",
       "财务数据分析师                        37500.0        4.50\n",
       "资深数据分析/数据分析专家G00796            45000.0        4.00\n",
       "资深数据分析专员                       30000.0        1.00\n",
       "资深数据分析师                        30000.0        6.67\n",
       "资深数据分析师 (MJ000088)             25000.0        4.00\n",
       "资深数据分析师（商品方向）G01053            45000.0        4.00\n",
       "资深数据分析师（杭州）                    37500.0       15.00\n",
       "金融数据分析师                        22500.0        5.00\n",
       "银行数据分析岗                        50000.0        5.00\n",
       "高级数据分析专员                       22500.0        4.00\n",
       "高级数据分析师                        30000.0        3.67\n",
       "高级财务数据分析师                      28750.0        4.50"
      ]
     },
     "execution_count": 14,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.groupby(by='positionName').agg(\n",
    "   salary_median=('salary','median'),\n",
    "   score_mean=('score','mean')\n",
    ").round(2)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 20 -聚合统计｜多层\n",
    "\n",
    "对不同行政区进行分组，并统计薪水的均值、中位数、方差，以及得分的均值"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "application/vnd.microsoft.datawrangler.viewer.v0+json": {
       "columns": [
        {
         "name": "district",
         "rawType": "object",
         "type": "string"
        },
        {
         "name": "('salary', 'mean')",
         "rawType": "float64",
         "type": "float"
        },
        {
         "name": "('salary', 'median')",
         "rawType": "float64",
         "type": "float"
        },
        {
         "name": "('salary', 'std')",
         "rawType": "float64",
         "type": "float"
        },
        {
         "name": "('score', 'mean')",
         "rawType": "float64",
         "type": "float"
        }
       ],
       "ref": "63d1a7b5-423b-4dc0-9ff8-4f7fe3fc3067",
       "rows": [
        [
         "上城区",
         "26250.0",
         "26250.0",
         "5303.300858899106",
         "2.0"
        ],
        [
         "下沙",
         "30000.0",
         "30000.0",
         null,
         "6.0"
        ],
        [
         "余杭区",
         "33583.333333333336",
         "30000.0",
         "10857.847721480402",
         "15.166666666666666"
        ],
        [
         "拱墅区",
         "28500.0",
         "30000.0",
         "3000.0",
         "2.75"
        ],
        [
         "江干区",
         "25250.0",
         "26250.0",
         "17255.433926737398",
         "39.25"
        ],
        [
         "滨江区",
         "31428.571428571428",
         "30000.0",
         "10445.436460825505",
         "12.952380952380953"
        ],
        [
         "萧山区",
         "36250.0",
         "37500.0",
         "10307.764064044151",
         "18.25"
        ],
        [
         "西湖区",
         "30893.939393939392",
         "30000.0",
         "7962.566302468827",
         "8.06060606060606"
        ]
       ],
       "shape": {
        "columns": 4,
        "rows": 8
       }
      },
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead tr th {\n",
       "        text-align: left;\n",
       "    }\n",
       "\n",
       "    .dataframe thead tr:last-of-type th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr>\n",
       "      <th></th>\n",
       "      <th colspan=\"3\" halign=\"left\">salary</th>\n",
       "      <th>score</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th></th>\n",
       "      <th>mean</th>\n",
       "      <th>median</th>\n",
       "      <th>std</th>\n",
       "      <th>mean</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>district</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>上城区</th>\n",
       "      <td>26250.000000</td>\n",
       "      <td>26250.0</td>\n",
       "      <td>5303.300859</td>\n",
       "      <td>2.000000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>下沙</th>\n",
       "      <td>30000.000000</td>\n",
       "      <td>30000.0</td>\n",
       "      <td>NaN</td>\n",
       "      <td>6.000000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>余杭区</th>\n",
       "      <td>33583.333333</td>\n",
       "      <td>30000.0</td>\n",
       "      <td>10857.847721</td>\n",
       "      <td>15.166667</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>拱墅区</th>\n",
       "      <td>28500.000000</td>\n",
       "      <td>30000.0</td>\n",
       "      <td>3000.000000</td>\n",
       "      <td>2.750000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>江干区</th>\n",
       "      <td>25250.000000</td>\n",
       "      <td>26250.0</td>\n",
       "      <td>17255.433927</td>\n",
       "      <td>39.250000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>滨江区</th>\n",
       "      <td>31428.571429</td>\n",
       "      <td>30000.0</td>\n",
       "      <td>10445.436461</td>\n",
       "      <td>12.952381</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>萧山区</th>\n",
       "      <td>36250.000000</td>\n",
       "      <td>37500.0</td>\n",
       "      <td>10307.764064</td>\n",
       "      <td>18.250000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>西湖区</th>\n",
       "      <td>30893.939394</td>\n",
       "      <td>30000.0</td>\n",
       "      <td>7962.566302</td>\n",
       "      <td>8.060606</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "                salary                             score\n",
       "                  mean   median           std       mean\n",
       "district                                                \n",
       "上城区       26250.000000  26250.0   5303.300859   2.000000\n",
       "下沙        30000.000000  30000.0           NaN   6.000000\n",
       "余杭区       33583.333333  30000.0  10857.847721  15.166667\n",
       "拱墅区       28500.000000  30000.0   3000.000000   2.750000\n",
       "江干区       25250.000000  26250.0  17255.433927  39.250000\n",
       "滨江区       31428.571429  30000.0  10445.436461  12.952381\n",
       "萧山区       36250.000000  37500.0  10307.764064  18.250000\n",
       "西湖区       30893.939394  30000.0   7962.566302   8.060606"
      ]
     },
     "execution_count": 16,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.groupby('district').agg({\n",
    "    'salary':['mean','median','std'],\n",
    "    'score':'mean'\n",
    "}).round(4)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 21 - 聚合统计｜自定义函数\n",
    "\n",
    "在 18 题基础上，在聚合计算时新增一列计算最大值与平均值的差值"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {
    "scrolled": true
   },
   "outputs": [
    {
     "data": {
      "application/vnd.microsoft.datawrangler.viewer.v0+json": {
       "columns": [
        {
         "name": "行政区",
         "rawType": "object",
         "type": "string"
        },
        {
         "name": "最低工资",
         "rawType": "int64",
         "type": "integer"
        },
        {
         "name": "最高工资",
         "rawType": "int64",
         "type": "integer"
        },
        {
         "name": "平均工资",
         "rawType": "float64",
         "type": "float"
        },
        {
         "name": "最大值与均值差值",
         "rawType": "float64",
         "type": "float"
        }
       ],
       "ref": "7c5b9bc0-2d3e-44af-8d13-3228036f5589",
       "rows": [
        [
         "上城区",
         "22500",
         "30000",
         "26250.0",
         "3750.0"
        ],
        [
         "下沙",
         "30000",
         "30000",
         "30000.0",
         "0.0"
        ],
        [
         "余杭区",
         "7500",
         "60000",
         "33583.33",
         "26416.67"
        ],
        [
         "拱墅区",
         "24000",
         "30000",
         "28500.0",
         "1500.0"
        ],
        [
         "江干区",
         "3500",
         "45000",
         "25250.0",
         "19750.0"
        ],
        [
         "滨江区",
         "7500",
         "50000",
         "31428.57",
         "18571.43"
        ],
        [
         "萧山区",
         "25000",
         "45000",
         "36250.0",
         "8750.0"
        ],
        [
         "西湖区",
         "6500",
         "45000",
         "30893.94",
         "14106.06"
        ]
       ],
       "shape": {
        "columns": 4,
        "rows": 8
       }
      },
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>最低工资</th>\n",
       "      <th>最高工资</th>\n",
       "      <th>平均工资</th>\n",
       "      <th>最大值与均值差值</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>行政区</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>上城区</th>\n",
       "      <td>22500</td>\n",
       "      <td>30000</td>\n",
       "      <td>26250.00</td>\n",
       "      <td>3750.00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>下沙</th>\n",
       "      <td>30000</td>\n",
       "      <td>30000</td>\n",
       "      <td>30000.00</td>\n",
       "      <td>0.00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>余杭区</th>\n",
       "      <td>7500</td>\n",
       "      <td>60000</td>\n",
       "      <td>33583.33</td>\n",
       "      <td>26416.67</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>拱墅区</th>\n",
       "      <td>24000</td>\n",
       "      <td>30000</td>\n",
       "      <td>28500.00</td>\n",
       "      <td>1500.00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>江干区</th>\n",
       "      <td>3500</td>\n",
       "      <td>45000</td>\n",
       "      <td>25250.00</td>\n",
       "      <td>19750.00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>滨江区</th>\n",
       "      <td>7500</td>\n",
       "      <td>50000</td>\n",
       "      <td>31428.57</td>\n",
       "      <td>18571.43</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>萧山区</th>\n",
       "      <td>25000</td>\n",
       "      <td>45000</td>\n",
       "      <td>36250.00</td>\n",
       "      <td>8750.00</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>西湖区</th>\n",
       "      <td>6500</td>\n",
       "      <td>45000</td>\n",
       "      <td>30893.94</td>\n",
       "      <td>14106.06</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "      最低工资   最高工资      平均工资  最大值与均值差值\n",
       "行政区                                  \n",
       "上城区  22500  30000  26250.00   3750.00\n",
       "下沙   30000  30000  30000.00      0.00\n",
       "余杭区   7500  60000  33583.33  26416.67\n",
       "拱墅区  24000  30000  28500.00   1500.00\n",
       "江干区   3500  45000  25250.00  19750.00\n",
       "滨江区   7500  50000  31428.57  18571.43\n",
       "萧山区  25000  45000  36250.00   8750.00\n",
       "西湖区   6500  45000  30893.94  14106.06"
      ]
     },
     "execution_count": 18,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "(df.groupby('district')\n",
    "     .agg(\n",
    "         最低工资=('salary','min'),\n",
    "         最高工资=('salary','max'),\n",
    "         平均工资=('salary','mean'),\n",
    "         最大值与均值差值=('salary',lambda x:x.max()-x.mean())\n",
    "     )  # named aggregation: aggregate and rename the columns in one step\n",
    "     .round(2)\n",
    "     .rename_axis('行政区')  # rename the index directly\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "![](http://liuzaoqi.oss-cn-beijing.aliyuncs.com/2021/09/16/16317972442543.jpg?域名/sample.jpg?x-oss-process=style/stylename)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.10.4"
  },
  "toc": {
   "base_numbering": 1,
   "nav_menu": {},
   "number_sections": false,
   "sideBar": true,
   "skip_h1_title": false,
   "title_cell": "Table of Contents",
   "title_sidebar": "Contents",
   "toc_cell": false,
   "toc_position": {
    "height": "calc(100% - 180px)",
    "left": "10px",
    "top": "150px",
    "width": "384px"
   },
   "toc_section_display": true,
   "toc_window_display": true
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}
